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Preface 



This book combines two related topics which are usually covered in separate texts, 
namely logic and integer programming (discrete optimisation). These two subjects 
have close connections and each is applicable to the other. 

Much of the work which makes up this book is due to others as well as the 
present author, in particular John Hooker and the late Robert Jeroslow. John Hooker 
has written a number of comprehensive books on the subjects which are referenced 
in the appropriate places. 

This book is shorter than these and intended to give a readable and succinct 
coverage of the subjects. It is intended for students (and practitioners and academics) 
in operational research, computer science and mathematics. 

Logic, by definition, involves a high level of abstraction as it is intended to sep- 
arate interpretations of logical systems from the structure of those systems. This 
can make for difficulties in comprehension as understanding is usually motivated by 
using concrete examples. In this book we have placed the emphasis on comprehen- 
sion by motivating the discussion by interpretations while pointing out when we are 
doing this. 

We avoid the usual format of definitions, theorems and proofs by introducing 
concepts and results within the text (usually italicised) by examples. References are 
given at the end of each chapter to further (often more mathematical) papers and 
texts on topics. 

Optimisation problems (in practice linear and integer programming) are not usu- 
ally presented in a logical context. However, doing this enables one to give their 
statement, as models, a greater precision. This often throws greater insight into 
their mathematical nature and properties. In addition logic is a valuable tool for 
modelling, and sometimes solving such models. 

Chapter 1 gives a basic introduction to logic and its aims as well as explain- 
ing the propositional and predicate calculus. Chapter 2 explains linear and integer 
programming (LP and IP) using the machinery of logic. It also explains the funda- 
mental structural and mathematical properties of these types of models. The main 
methods of solving IP models are described. The most common types of IP model 
restrict the variables to 0 or 1 (with the obvious logical false/true interpretation). 
The major areas of practical application are explained. Also a section is devoted 
to the attempt to distinguish between computationally ‘easy’ classes of problem 
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and ‘difficult’ classes of problem under the subject of ‘computational complexity’ . 
Chapter 3 applies logic to the formulation of IP models using the methods explained 
in Chapter 1. There are often ‘good’ and ‘bad’ ways of modelling IPs (from both the 
explanatory and ultimately solvability points of view). Also the deeper mathematical 
concepts involved (e.g. ‘convexification’) are covered here. Chapter 4 covers the 
fundamental problem of computational logic, namely the ‘satisfiability problem’. 
This problem lies at the heart of most of what is covered in this book. Methods 
of solving this problem through both logic and integer programming are given and 
their connections described. Applications are given in a number of diverse fields. It 
is also shown how IP models can be expressed as satishability problems and solved 
as such. 

Readers familiar with either logic or IP or with both may be able to skip 
appropriate sections. However, it is expected that all chapters will contain mate- 
rial not familiar to everybody and they are recommended to read the book in its 
entirety. If it is desired to pursue a particular topic in greater depth then it is hoped 
that the references at the end of each chapter will provide a route into the topic. 
The references are therefore intended for this purpose. 1 am aware that there are 
many research papers not referenced, covering signihcant work. I apologise, in 
advance, to any authors who might feel slighted, but the intention of the book is 
to explain and provide pursuable introductions to topics rather than comprehen- 
sively cover all work done. Exercises are given at the end of each chapter. These 
are intended to reinforce, and sometimes expand on, the material in the chapter. As 
with many mathematical subjects attempting the exercises is often the best way of 
understanding. 

An obvious question, which a reader might sensibly ask, is “what is prescribed 
as the best method of solving a practical problem’’. No dehnitive answer is given 
in this book. Different problems demand different approaches. However, what is 
needed is a thorough knowledge of basic logic for the purpose of clarifying and 
modelling discrete optimisation problems. Also needed is a knowledge of how linear 
and integer programming can be used to solve the resultant models, often enhanced 
by using logical methods. It is hoped that the material in this book will help in both 
these areas. 

I am indebted to a number of friends and colleagues for reading and commenting 
on sections of this book. In particular I would like to mention Gautam Appa, Nikos 
Argyris, John Hooker and John Wilson. 

Also 1 would like to acknowledge the help I have received from Carol Hewlett, 
Nikos Argyris and Mara Airoldi in the use of the software Scientific Workplace, 
Latex and IPE in the preparation of this book. 

Einally I would like to acknowledge the help which resulted from Lever- 
hulme Research Eellowship RE&G/10185 and EPSRC Overseas Travel Grant 
EP/C530578/1 in preparing this book. 



Winchester, UK 



H. Paul Williams 
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Chapter 1 

An Introduction to Logic 



1.1 The Purpose of Logic: Philosophical: Computational 

Traditionally logic has been concerned with the /orm of statements as opposed to 
their content. The aim is to produce a system in which it is possible to deduce true 
statements from other ‘true’ statements independently of what one is talking about, 
i.e. independently of the interpretation of the statements. The usual approach is 
to start with a set of axioms and rules of deduction and produce statements which 
are true if the axioms are true. Such statements are said to be analytically true, 
as opposed to statements which are regarded as true on the basis of experimental 
evidence in, e.g. sciences such as physics, biology, psychology, etc. A major 
philosophical aim in the late 19th and early 20th centuries was to use logic to put 
mathematics on a rigorous footing. It was then hoped that mathematical statements 
could be regarded as having an absolute truth in this sense. The pattern which it was 
hoped to follow was that done by Euclid for geometry which was very successful. 
Russell and Whitehead devoted an enormous amount of time to trying to axiomatise 
mathematics. They regarded mathematics as simply an extension of logic. Unfortu- 
nately their efforts were largely in vain as a result of two discoveries, mainly due to 
Godel. 

First it was shown that it was impossible to prove the consistency of the axiom 
system without using methods which went beyond the mathematical system itself. 
A meta system was needed which was richer than the original system. If a system is 
not consistent then, in any worthwhile form of logic, one can prove any statement 
(and its negation) rendering the system vacuous. 

Second it was shown that, however, many axioms one had there would always 
be ‘true’ statements which could not be proved (unless one added new axioms in 
which case further ‘true’ statements could be found). 

The confusion between ‘truth’ and provability was responsible for the difficul- 
ties in formalising mathematics. The former is a less well-defined concept than 
the latter. As a result of the failure of Russell and Whitehead’s agenda, attention 
turned to less ambitious aims of only putting ‘parts’ (fragments) of mathematics 
on a rigorous footing (e.g. restricted forms of set theory). Some of these fragments 
have proved useful in a computational setting (e.g. arithmetic without multiplication 
and the Theory of dense linear order). The purpose of this book is to use logic for 
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modelling and computational purposes. The methods of logic can be used to solve 
less ambitious problems than the formalisation of all mathematics. The problems 
which we use it for, in this book, arise mainly in operational research and computer 
science. They are often (but not always) optimisation problems. 

However, the efficiency of the computational methods used depends first on how 
the models are built and second the methods used, some of which rely on logic. 
We will mainly be concerned with discrete optimisation, i.e. where some quanti- 
ties are restricted to taking discrete values, as opposed to a continuum of (say) the 
real or rational numbers. It is not always obvious, with any one problem, to what 
extent one uses logic or to what extent one uses more traditional methods. There is 
great advantage, however, in being able to move between the two and recognise the 
relationships between them. In this sense discrete optimisation (usually known as 
integer programming) and logic are symbiotic. 



1.2 Logical Inference and Consistency 

As already stated logic is concerned with deducing statements from other state- 
ments. This is the process known as inference. We formalise this in Sects. 1.3 and 
1 .4 where we represent statements symbolically and show how to manipulate them. 
Sometimes this is referred to as symbolic logic. 

For example 



Is it valid to make the following inference ? (1.1) 

Glasgow is in Scotland and England and Scotland are part of Britain 
Manchester is in England or Scotland (1.2) 

Therefore 



Manchester and Glasgow are both in Britain (1.3) 

Note that (1.1) is a question which depends on the structure of the sentences and 
rules of deduction to be used, i.e. it is not a question which needs a knowledge of 
geography to answer. Also it is a question about the system of statements we are 
working with, not a statement within the system we are dealing with (which (1.2) 
and (1.3) both are). It is said to be a statement within the meta system as opposed to 
the system. The meta system is the language for talking about the system. Attempts 
to encompass a meta system within the language of the system is what led to a lot of 
the paradoxes (e.g. Russell’s paradox) in mathematics. In order to distinguish meta 
statements from statements we will use the symbol ‘=J>’ to represent inference. By 
saying A B we are meaning that from the statement (or set of statements) A 
we can infer the statement B. The set of premises in (1.2) can be represented by A 
and the conclusion (1.3) represented by B. Consistency is concerned with whether 
a system can ever produce a contradiction, i.e. can we infer both a statement and 
its negation? Again consistency is a meta concept. It is a property of a system not 
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a statement within a system. Consistency and inference are closely related to each 
other. A set of statements is only consistent if we cannot infer a contradiction from 
them. A conclusion can only be inferred from a set of premises if the negation of 
the conclusion is inconsistent with the premises. 

Another important property of a logical system consisting of statements, axioms 
and rules of deduction is completeness. A system is complete if every statement in 
the system can either be proved to be true or false. 

As a result of Godel’s work it was found that a system rich enough to encompass 
full arithmetic could not be shown to be either consistent or have an axiom system 
which makes it complete. However, the system that we will be using most, in this 
book, namely the propositional calculus (also known as Boolean algebra) is both 
consistent and can be given an axiom system which makes it complete. Also the 
richer system, known as the Predicate Calculus is consistent and has a complete 
axiomatisation. 

In this book we will not be using an axiomatic treatment. As we are more con- 
cerned with applying logic and integer programming for practical (as opposed to 
philosophical. Important though this is) purposes we will not be concerned with 
proving the above properties or otherwise. We will use more intuitive methods when 
necessary. 

There is a way of viewing inference which connects with another theme of 
this book. That is optimisation. Linear and integer programming (introduced in 
Chapter 2) are concerned with maximising or minimising linear expressions subject 
to linear constraints (in real or integer numbers, respectively). As such they are 
attempting to find the strongest linear constraints which can be inferred from the 
original constraints. Therefore, optimisation can be viewed as a method of infer- 
ence. When a logical verihcation problem is cast into the form of an optimisation 
problem it can be solved as such. This is done in Chapter 3. Alternatively integer 
programming models can be cast as logical inference problems as done in Chapter 4. 

A mathematical or logical system is said to be decidable if there exists a ‘mechan- 
ical’ procedure for deciding the truth or falsity of a statement in the system. Such a 
procedure is known as a decision procedure or an algorithm. Again the methods of 
Godel and others demonstrate that full arithmetic is not decidable. Some ‘smaller’ 
systems are also undecidable. However, the propositional calculus, which we are 
mainly concerned with in this book, is decidable. The predicate calculus is not 
decidable although there are decision procedures for always deciding if a statement 
is false (but not if a statement is true). Also appending elements to the predicate 
calculus can produce decidable systems as described in Sect. 1.5. 



1.3 The Propositional Calculus 
1.3.1 Connectives and Truth Tables 



This system is sometimes called "Boolean algebra". It combines atomic statements, 
which can take truth values true (T) or false (F) , into compound statements whose 
truth depends, by the rules of this calculus, on the truth values of the component 




4 



1 An Introduction to Logic 



atomic statements. We will represent atomic statements by literals A, B,X\, X 2 , 
etc. and their negations by A, B,X\, X 2 , etc. (the former set of literals will be 
said to have ‘positive sign’ and the latter set of literals ‘negative sign’). Both the 
individual letters and their negations are known as literals. The atomic statements 
and compound statements are combined by connectives to produce more compound 
statements whose truth value depends on the truth values of the component state- 
ments. The connectives which we will initially consider are ‘v’ (‘or’) , ‘-’ (‘and’) , 
‘— ’ (‘not’) , ‘ — >’ (‘implies’) and (‘if and only if’). The effects of these are 

given in the truth table below. 



A 

T 

T 

F 

F 



Table 1.1 The Truth Table for Common Connectives 



B 

T 

F 

T 

F 



AV B A - B 
T T 

T F 

T F 

F F 



A A — > B 
F T 

F F 

T T 

T T 



B 



T 

F 

F 

T 



Table 1.1 can be taken as the definition of these connectives. When the truth table 
is applied to a compound statement we have a decision procedure for determining 
the truth or otherwise of the statement. We use this as an alternative to the axiomatic 
approach. However, there are often more efficient decision procedures (e.g. integer 
programming) for particular problems which are discussed in subsequent chapters of 
this book. Some discussion should be given to the nature of the connectives defined 
in Table 1.1. Since these are the definitions they need not, necessarily, totally con- 
form to common usage, ‘v’ is sometimes called the ‘inclusive or’ since it is also true 
when both the component statements are true (as opposed to the ‘exclusive or’ which 
is not true if both components are true). ‘ — does not represent implication in any 
causal sense. In particular it is defined as true when neither component statement is 
true, when in practice it might be thought to be inapplicable. It is only false when 
the first statement is true and the second false. The connective ‘ ^ ’ is sometimes 

called ‘equivalent to’ since the two component statements must have the same truth 
value for it to be true. It must be emphasised again that these connectives are within 
the system. For example ‘=^’ is a meta relation about statements in the system 
although it obviously has a semantic correspondence with ‘ — ^’. Similarly 
should not be confused with the meta relation ‘=’ which means two statements 
within the system are equivalent to each other. 

Apart from ‘— ’ (which can be regarded as a connective applying to one state- 
ment, which is written above the statement) all the other connectives in Table 1.1 
connect two component statements. Obviously there are 16 (2“*) possible ways of 
assigning the two truth values T and F to the four rows of a truth table connecting 
two statements. One of these is to assign T to each row indicating what is known 
as a tautology' (always true). Another is to assign F to each row indicating what 
is known as a ‘contradiction’ (always false). There are, however, a number of other 
connectives (9 in total) which we have not defined in Table 1 . 1 but which may be 
useful in some circumstances. 
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1.3.2 Equivalent Statements 

It is easy to verify, by means of truth tables (and is left as Exercise 1.7.2), that the 
following equivalences hold: 



A- B = B ■ A 
Av B = B V A 
A ■ {B w C) = {A ■ B)y {A ■ C) 

A V {B.C) = (A V 5) • (A V C) 

A - B (AvB) 

Aw B (A ■!}) 

A — > B = Av B 
A B = (A — > B)- (B — > A) 
A — > B = B — ^ A 
A-A= F 
AvA= T 

A= A 
A-T = A 
A - F = F 
AwT = T 
Aw F = A 
A ■ A = A 
A V A = A 



(1-4) 

(1-5) 

( 1 - 6 ) 

(1.7) 

( 1 - 8 ) 

(1-9) 

(1.10) 

( 1 - 11 ) 

(1.12) 

(1.13) 

(1-14) 

(1.15) 

(1.16) 

(1.17) 

(1.18) 
(1-19) 
( 1 . 20 ) 
( 1 - 21 ) 



Note the use of the meta symbol for the equivalence between statements. 

Equations (1.4) and (1.5) are known as the commutative laws and (1.6) and (1.7) 
as the distributive laws. (They are analogous to similar laws for addition and multi- 
plication in arithmetic, although if ‘v’ is analogous to addition and to multipli- 
cation (1.7) does not apply.) Equations (1.8) and (1.9) are known as De Morgan’s 
laws. They demonstrate a symmetry between the ‘v’ and connectives. These laws 
enable one to manipulate expressions into equivalent expressions and standardise 
them into normal forms as described below. Repeated application of De Morgan’s 
laws to a statement, in order to negate it, changes all the ‘v’ connectives to and 
vice versa and negates the unnegated literals and unnegates the negated ones. The 
resultant statement is said to be the logical dual of the original and vice versa. 
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Equation (1.12) is known as the contrapositive. Intuitively it makes sense. 
B is True only if A is true. Therefore it B is not true then A cannot he true. 
This transformation is very useful in certain integer programming applications as 
described in Chapters 3 and 4. 

Compound statements that are always true for any truth settings of the atomic 
statements (such as (1.14) ) are known as tautologies. In contrast compound state- 
ments that are always false (such as (1.13)) , for any truth settings of the atomic 
statements, are known as contradictions. 

In order to aid readability it is convenient to regard the connective as more 
binding than the other connectives. This enables one to dispense with the brackets 
in the right-hand expression in (1.6) and the left-hand expression in (1.7). Note that 
in many texts the symbol ‘A’ is used instead of and negation of a statement 
A is indicated by ~ A instead of A. We use our notation to aid readability but 
sometimes use ~ instead of — when we wish to negate a compound statement 
instead of a literal and wish to avoid long cumbersome bars across the top of an 
expression. 



1.3.3 Disjunctive and Conjunctive Normal Forms 

Let us consider a truth table for a statement S with n component statements referred 
to as Ai, A 2 , . . . , A„. We represent this in Table 1.2. 



Table 1.2 A General Truth Table 
Ai A2 A„ S 

T T ... T - 
T T F - 



F 



F 



F 



Let S have the value T for rows ii, h, . . . ,ir and L for all the other rows and 
the entries for Ai , A?, . . . , A„ in row i , are T for columns k,- , 1 , k, 12 , .... k 

■' J’J J’J h'-(/ 

and L for columns k n. + 1 , k + 2 , ... , ki. j” . Then we can represent row ij by the 

ijjj ijJj ^ 

statement 



^k. .1 




Pij ' Pi.+\ 

'j’‘j ‘j’‘j 




Pi . +2 



( 1 . 22 ) 



This is easily verified by the truth table for since, for it to be true, the atomic 
statements A^. , must take the values specified for this row of the truth table. 
Equation (1.22) is known as a conjunctive clause since the connective is also 
known as a ‘conjunction’ . 
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Since S takes the value T for each of the rows t'l , t 2 , ■ . • , ir S will be true if 



^k. ,1 


^k. .2 ' ' 

'D'l 


" ' Pi 

'D'l 


* Ak . 1 * At , -> * * 

^ Pi. +1 ^ Pi, +2 

Hy4'i 'l.'l 


• • • Aj^. 

^'14^ 




V Ak^ 
'2^2 


* ^k. .2 ' ' 
'2^‘2 


" ' Pi 

'2 42 


^k py +1 ^k Pi.^+2 

'242 '242 


' * * A^. 

S42 
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(1.23) 



i.e. if any of the conjunctive clauses is true. Statement (1.23) is known as a disjunc- 
tion of the clauses since they are combined by the ‘v’ connective and ‘v’ is also 
known as a ‘disjunction’. 

Statement (1.23) is said to be in extended disjunctive normal form (EDNF). For 
a statement to be in FDNF it is made up as a disjunction of conjunctive clauses 
each of which contains literals (atomic statements which are negated or unnegated) 
for all atomic statements in the original statement (1.23) serves to prove that any 
statement in the propositional calculus can be written using the connectives 
‘v’ and ‘-’.Of course there are many other ways of writing a statement either using 
just these connectives or other connectives. However, FDNF also provides a stan- 
dard (if sometimes uneconomical) and unique way (up to the order of the literals 
and clauses) of expressing any statement. The following example demonstrates its 
use 

Example 1.1 Express statement S as defined in Table 1.3 using EDNF. 



Table 1.3 A Function Defined by a Truth Table 



A 


B 


C 


S 


T 


T 


T 


F 


T 


T 


F 


F 


T 


F 


T 


T 


T 


F 


F 


F 


F 


T 


T 


T 


F 


T 


F 


T 


F 


F 


T 


T 


F 


F 


F 


T 



Taking rows 3, 5, 6, 7 and 8 of the table gives the statement 

ABCvABCvABCvABCvABC (1.24) 

Often we can be content with (non-extended) disjunctive normal form (DNF). 
Here we again express a statement as a disjunction of conjunctions but do not insist 
that each conjunction contains literals for each atomic statement in the original state- 
ment. Such a form lacks uniqueness but will generally be more compact. Converting 
it into the simplest form is addressed in Chapter 4. 
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Example 1.2 demonstrates how the laws (1.4) - (1.21) can be used to convert a 
statement into EDNF. 

Example 1.2 Convert the following statement into EDNE: 

(A — >B)~K-C (1.25) 

Using the equivalencies given above we can successively convert (1.25) to 

(AvB)-A-C (1.26) 

A-CvABC (1.27) 

This statement is in DNF but not in EDNF since the first clause does not contain 
the literal B . We can expand it to 

ABCvABC (1.28) 

using equivalence (1.14). Equation (1.28) also contains the second conjunctive 
clause in (1.27) and is therefore the equivalent statement to (1.25) in EDNF. 

There is another normal form which is widely used. This is extended con- 
junctive normal form (ECNF). It consists of a conjunction of disjunctive clauses 
(each containing literals for all atomic statements in the extended case). We 
can use De Morgan’s laws to convert a statement from EDNF to ECNF and 
vice versa. However, there is a more straightforward way of giving a truth 
table representation of a statement in ECNF as well as one in EDNF. We 
observe that the negation of the statement is represented by those rows of 
the truth table corresponding to the rows for which the statement is false. For 
example, for Table 1.3, this corresponds to rows 1, 2 and 4 giving rise to the 
statement 



A - B ■ Cv A - B ■ Cv A- B ■ C (1.29) 

Negating this statement gets us back to the original statement which we wish to 
represent. We can negate (1.29) using De Morgan’s laws (1.8) and (1.9). The effect 
is to create the logical dual of (1.29) which is 

(AvBvC)-(AvBvC)-(AvBvC) (1.30) 

It can be checked (Exercise 1.7.3) that this statement (in ECNF) also represents 
Table 1.2. The rule for converting a truth table representation to a statement in ECNF 
is therefore to take each row of the truth table for which the statement is false, 
create a disjunction of the literals in it by negating those literals corresponding to a 
T entry and not negating those entries corresponding to an F entry and then taking 
the conjunction of the resultant (disjunctive) clauses. 
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If a statement is in non-extended DNF or CNF (i.e. each clause does not nec- 
essarily contain all literals) it is cumbersome to expand it into EDNF or ECNF in 
order to use the truth table representation to create the other normal form. Instead 
we can use De Morgan’s laws to convert to the other form. We demonstrate this by 
the following example. 

Example 1.3 Convert the following statement in DNE into CNE: 



ABC w B ■ D V A- D (1.31) 

Using the distributive law (1.7) we combine one literal from each conjunctive 
clause into a disjunction in all possible ways and form the conjunction of these 
disjunctions (also using the simplifications (1.14) and (1.21)). This gives 

(Av B) - (Av B V D) - A - (AvB vD) - (Av B V O - (Av C V D) (1.32) 

This statement is in far from its simplest form but the topic of ‘simplification’ is 
postponed until Chapter 4. 

It is often much more economical to express in either DNF or CNF depending on 
the statement and application. Obviously, in order to do the reverse transformation 
and convert a statement in CNF into DNF, we can use the (logically dual) form of 
De Morgan’s laws (see Exercise 1.7.4). 

In order to show the potential complexity of such transformations we consider 
another example. 

Example 1.4 Convert the following statement in DNE into CNE: 

Xi ■ X2 V X3 ■ X4 V Xj ■ Xfi V ■ ■ ■ V Z2„_i ■ X2„ ( 1 . 33 ) 



This results in the following conjunction of disjunctions: 



(Zi V X3 V X5 . . . V X2„_3 V X2„_i) 
• (Xi V X3 V X5 . . . V X2„_3 V X 2 n) 



■ (Z2 V X4 V Xe . . . V X2„_2 V X2„_l) 

■(X 2 V X 4 V Xfi . . . V X 2„_2 V X 2 „) (1.34) 

It can be seen that instead of the n (conjunctive) clauses of 2 literals each we now 
have 2" disjunctive clauses of n literals each. What’s more no simplification is pos- 
sible. Clearly, in this case, DNE is more compact. Also the amount of computation 
necessary to carry out the transformation is an exponential function of the number 
of variables. However, in other cases (e.g. the dual of (1.34)) CNF may be more 
compact. This is obviously an ‘extreme’ example but in more general cases the two 
forms may be very unbalanced in the size of their representations. 
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There is another, ingenious, approach to transforming statements in DNF to CNF 
and vice versa which reduces the computational complexity by introducing new 
literals. We demonstrate this by again using Example 1.4. 

We introduce new literals j representing the conjunctions Xi ■ Xj in (1.33). 
Equation (1.33) can now be written as 



^1,2 V ^ 3,4 V Xs,6 V • • • V X2n-3,2n-2 V X2n-l,2n 



(1.35) 



i.e. as a disjunctive clause. 

We must also model the conditions 

Xij ^ Xi ■ Xj (1.36) 



These may be written as 



(Xij V Xi) ■ (Xij V Xj) (1.37) 

Hence we have the conjunction of (1.35) with all the pairs of disjunctive clauses 
in (1.37) for each pair of i and j in each literal in (1.35). This is clearly a statement 
in CNF. It contains 3n literals and 2n + 1 disjunctive clauses. One of them has n 
literals and the rest 2 literals each. 

Exercise 1.7.6 involves converting the dual expression to (1.33), i.e. a statement 
in CNF, to a statement in DNF in an analogous manner. 



1.3.4 Complete Sets of Connectives 

We have shown that, by means of EDNF or ECNF, we can express any statement 
using only the connectives ‘v’ and They are said to be a complete set of 
connectives in the sense that any compound statement can be written using them 
alone. It is, in fact, possible to suffice with only the set of connectives {— , v). 

In order to show this we can apply De Morgan’s law (1.8) in order to replace all 
occurrences of in a statement by — and V. 

Similarly the set of connectives {— , •} form a complete set as can be 
seen by eliminating all occurrences of ‘v’ in a statement using De Morgan’s 
law (1.9). 

However, it is also possible to represent all statements using a single connec- 
tive. There are two such complete connectives (of two atomic statements). They are 
represented by the symbols ‘j,’ and ‘|’ and are referred to as the connective arrow 
and the Sheffer stroke, respectively. They are defined in the following truth table 
(Table 1.4).' 

In intuitive terms ‘j,’ and ‘|’ represent the logical connectives ‘nor’ and ‘nand’ 
(‘not and’), respectively. They are sometimes manufactured as logical ‘gates’ for 
electrical circuits, as discussed in Chapter 4, in view of their completeness. 
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Table 1.4 Truth Tables for the Connective Arrow and Sheffer Stroke 



A B A i B 

T T F 

T F F 

FT F 

F F T 



A I B 
F 
T 
T 
T 



In order to demonstrate their completeness we can show, by means of truth 
tables, that and ‘v’ can be expressed using them. We can demonstrate 



(Exercise 1.7.7) the following equivalences: 

A = A,(A (1.38) 

A- B = {A i A) i(B i B) (1.39) 

Aw B = (A i B) i{A i B) (1.40) 

A = A|A (1.41) 

A ■ B = (A\B)\(A\B) (1.42) 

Av B = {A\A)\{B\B) (1.43) 



Exercise 1.7.8 is to prove that these are the only possible complete connec- 
tives combining two variables. There are, however, many more complete connec- 
tives for combining three, four and more variables, but this is beyond the scope of 
this book. 



1.3.5 The Calculus of Indications 

This is an ingenious method of constructing the propositional calculus using one 
symbol only. The method has found some application in the design of electrical 
circuits. It also leads to great notational economy. The single symbol which we use 
is The originator of the method intended it to stand for a "distinction’ which he 
felt was the most fundamental useful concept possible. Erom this primitive notion he 
constructed his calculus. We will not discuss the philosophical aims of the originator 
but we will describe the mechanics of the system. 

The symbol ’ can be combined with itself in two ways: 

11 =1 (1-44) 

1^ = (1-45) 

The ‘empty’ symbol in (1.45) is deliberate. Writing the symbol above itself in 
(1.45) cancels it out, whereas juxtaposing it in (1.44) results in itself. 
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If we have a structure such as 



(1.46) 

we can reduce it down to ’ or ‘ ’ (the empty symbol). Any expression can be 

simplified down to one of these two symbols. Exercise 1.7.9 involves simplifying 
this and other expressions. 

We can represent general expressions built up from by (meta symbols) 
a,b, . . . . If known these expressions will simplify to ’ or ‘ ’ . We can give them the 
interpretations F and T, respectively, i.e. a,b, . . . can be interpreted as statements 
which are false or true according to whether they reduce to ’ or ‘ ’. 

The expressions a,b, . . . can be incorporated into enhanced expressions, e.g. 



(1.47) 

Here the symbol ’ can be interpreted as a connective applying to any number 
of statements. For example a ] is a connective applying to one statement. 





is a connective applying to two statements. 



(1.48) 



is a connective applying to three statements. 

Since the expressions a,b,c,. . . each reduce to (F) or ‘ ’ (T) we can create 

truth tables for the application of . 

It can be verified that 



a a] 
TF 
F T 



since if a reduces to (empty) (T) then a^ reduces to ] (F). However, if a reduces to 
] (F) then reduces to (T). Clearly, applied to one statement, ] performs to role of 
the — connective. 
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Applying ] to two statements it can be verified that 



a b a b 
TT F 
T F T 
FT T 
F F T 



This is the ‘ | ’ (Sheffer stroke) (nand) connective defined above. As demonstrated 
there, it is a complete connective. It can be verified (Exercise 1.7.10) that ’ applied 
to any number of statements yields a complete connective for that number of state- 
ments. Identities (1.41) - (1.43) show how any statement can therefore be modelled 
using For example instead of modelling ‘-’in the form of (1.42) we can write it 



as ab .‘v’ can he modelled ^ ^ I as in contrast to (1.43). 

Exercise 1.7.1 1 involves using this ‘multivalued’ connective to represent a num- 
ber of statements. It has the advantage that it is not always necessary to repeat the 
same literal (which also does not need to be negated) , as is necessary in expres- 
sions such as (1.42) and (1.43). One can also develop normal forms for compound 
statements using this symbol. 



1.3.6 Venn Diagrams 

These are a useful way of interpreting and representing compound statements. Each 
atomic statement corresponds to a region of the plane. We represent all statements 
hy a ‘universal’ region drawn as the rectangle in Fig. 1.1. 




Fig. 1.1 A Venn diagram 



The left-hand circle represents the statement A being true. The ‘complementary’ 
region to this circle represents A being true, i.e. A being false. The right-hand circle 
represents the statement B being true. The intersection of these circles represents A ■ 
B being true and the union of the circles represents Av B being true, i.e. ‘ v’ models 
the union (U) operation of Elementary Set Theory and ‘-’ models the intersection 
(n) operation, while - models the complement operation. 
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Using Fig. 1.1 it is easy to verify, for example, De Morgan’s laws. The region 
outside the two circles is modelled by A V B. It is easy to see that this is also the 
intersection of the complement of A with the complement of B which is modelled 
by A ■ B . 

Figure 1.2 demonstrates how extended disjunctive normal form can model a 
statement. 




Fig. 1.2 A second Venn diagram 

All eight disjoint regions are modelled by the eight conjunctive clauses made 
up from A, B, C and their negations. A is represented by the top circle, B by the 
left circle and C by the right circle. We have used the symbols A, B, C in two 
senses. On the figure they represent regions with the set operations ‘U’, ‘O’ and 

’ (complement) applied. In the propositional calculus statements they represent 
atomic statements (indicating membership of the corresponding region) with the 
corresponding connectives ‘v’, and ’ (‘not’) (using the same symbol as for the 
set complement). 



1.4 The Predicate Calculus 

While the propositional calculus is sufficient for many purposes it is not sufficient 
to formalise all mathematics. The predicate calculus was created for this purpose. 
In this system statements can be made about objects by means of predicates, e.g. 
P(x, y). P(x, y) could stand for a statement such as 

‘x is the father of y' (1.50) 

If the variables x and y are set to constants such as ‘James ’ and ‘Mary ’ we have 
the statement P(James, Mary) , which will be true or false meaning 



‘James is the father of Mary' 



(1.51) 
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Predicates can be of one, two, three or any number of variables. The process of 
setting variables to constants is known as instantiation. Each variable in a predicate 
has a domain of possible values. In addition to the predicates we have the connec- 
tives of the propositional calculus for combining them into compound predicates. 
Instantiation of all the variables in the predicates of an expression converts it to a 
statement in the propositional calculus. 

For example if we have the statement 

{P{x,y)y Q{z))^ R{x,y,z) (1.52) 

we might instantiate it to 

{Pia,b)-V Q{c)) R{a,b,c) (1.53) 

so long as a, h and c were in the appropriate domains. Its truth, or otherwise, will 
depend on the truth values of the atomic statements in the standard way. 



1.4.1 The Use of Quantifiers 

A major strength of the enhanced expressive power of the predicate calculus is the 
use of quantifiers. These are written as ‘V’ and ‘3’. ‘V’ means ‘for all’ and ‘3’ 
means ‘there exists’. If, for example, we have 'ixQ{x) this means ‘for all objects in 
the domain of x, 2(x) is true’. This can then be regarded as an atomic proposition 
of the propositional calculus. Likewise the statement 3x Q{x) means ‘there exists an 
object in the domain of x for which Q{x) is true’. 

The following statements are usually stipulated in the form of axioms of the 
predicate calculus but we simply state them as valid deductions and equivalences in 
our informal treatment. They obviously accord with the meanings we have given to 
the quantifiers. They enable us to manipulate and translate statements in the calculus 
(into, for example, normal forms). As before we assume that the objects (constants) 
a,b,c, . . . fall within the permitted domains of the variables x, y, z, . ■ . where they 
are substituted 



VxQ(x) = 


Q(a) 


(1.54) 


Q(a)^ 


> BxQ(x) 


(1.55) 


'Vxg(x) = 3x'g(x) 


(1.56) 


~3xQ(x) = Vx~Q(x) 


(1.57) 



Note that we are using the symbol ‘"’ as an alternative to ‘— ’ solely for nota- 
tional convenience. Equations (1.56) and (1.57) are obviously generalisations of 
De Morgan’s laws. 
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If the domain of a predicate is finite (e.g. a, b, . . . , n) then we can use the con- 
nectives and ‘v’ in place of ‘V’ and ‘3’, e.g. 

^xQ{x)= Q(a)- Q(b) Q{n) (1.58) 

BxQ(x)= Q(a)v Q(b)v ■■■V Q(n) (1.59) 

Besides the ability to quantify over inhnite domains (e.g. the natural, real or 
rational numbers) the new notation also has great advantages in helping us to create 
clear and succinct expressions even if the domains are hnite. It can be very useful in 
modelling as is illustrated in Chapter 3. 

The predicate calculus is sometimes referred to as ‘hrst-order theory’ since it 
allows us to quantify the objects within predicates but not ‘higher order’ objects 
such as the predicates themselves. 

Each quantiher has a scope over which it applies, e.g. 

^x{T {x , y) \/ U {x , z)) ■ ~W {s , z) (1.60) 

is itself a predicate of the variables y, z, s. The scope of 3x is the first disjunction 
of T and U and is indicated by bracketing them. It does not extend to W. The 
variable x is said to be bound by the quantiher. It is a dummy variable which could 
equally well be replaced by another variable and the meaning be unchanged. To 
save confusion it is clearer not to use such a variable elsewhere, beyond the scope 
of the quantiher. Variables that are not bound are said to he free, y, z, s are free in 
the above expression. Since W does not involve x there would be no ambiguity in 
extending the scope of 3x to the whole expression by moving the last bracket after 
U to the end of the expression. 



1.4.2 Prenex Normal Form 

Using (1.56) and (1.57) we can move all the quantihers to the beginning of an 
expression creating what is known as prenex normal form. We illustrate this by 
an example. The expression which is quantihed will be known as the core. 

Example 1.5 Transform the following expression into prenex normal form and the 
core into DNF: 

Vxi(3x2/?(li, L 2 )) — ^ 3x3(5'(x3, X 4 , X 5 ) ■ Vx6T(x 4, xe)) (1-61) 

Note that only variables xi,X 2 ,X 3 and xe are bound. Therefore the expression is 
itself a predicate of the free variables X 4 and xs. We can successively transform 
(1.61) into 
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Vxi(3x2(P(xi, X2)) — 


3x3(5'(x3, X4, X5) ■ Vx6T(x 4, xe))) 


(1.62) 


Vxi(3x2(P(xi, X2)) — - 


3x3Vxe(5(x3, X4, X5) 


• T(x 4, xe))) 


(1.63) 


Vxi3x23x3Vx6(P(xi, 


X2)) ^ (5(X3,X4,Xs) 


• r(x4, xe)) 


(1.64) 


Vx 1 3x2 3x3 Vx6 CR (x 


1,X2 ))V( 5 '(X 3 ,X 4 ,X 5 )- 


r(X 4 ,X 6 )) 


(1.65) 



It is sometimes convenient to allow one quantifier to apply to a group of variables, 
e.g. 3 xi3jc 2 can be written as 3xiX2- Similarly for V. Also it can easily be verified 
that 3xiX2 means the same as 3x2Xi . Similarly for V. 

When reading a statement such as (1.65) it must be remembered that the quanti- 
fiers should be read from left to right. The left most quantifier applies to the state- 
ment (including quantifiers) to its right and so on. However, the order of 3 and 
V cannot be reversed. 3xVyF(x, y) does not mean the same as Vy3xP(x, y). To 
illustrate this let us take the domain of x and y as the integers and P(x, y) as the 
relation x ^ y. The first statement is clearly not true in this interpretation since there 
is no integer which is smaller or equal to all the rest. However, the second statement 
is true in this interpretation since for any integer there is always an integer which is 
smaller or equal to it (e.g. itself). One might conveniently represent such an integer 
as Xy indicating that it depends on the value of y, whereas for the first statement we 
were seeking an x which was independent of the value of y. 

In all interpretations of the predicate P(x, y) and the domains of x and y, 
3xVyP(x, y) is stronger than Vy3xP(x, y). We prove this below. 

Example 1.6 Show that, in general, 

3xVyP(x,y) Vy3vP(x, y) (1.66) 



If the premise holds then for all y there is an x such that P{x, y), i.e. a particular, 
single, X serves the purpose for all the y . (For the positive integers, with the predicate 
‘x ^ y’ for example, 1 serves the purpose.) Hence for a particular y the conclusion 
will also be true since the specific x chosen for the premise will also serve the 
purpose. 

In order to demonstrate the expressive power of the predicate calculus in a non- 
mathematical setting we consider the following example. 

Example 1.7 Express the following statement using the predicate calculus: 

You can fool all of the people some of the time (1-67) 

and some of the people all of the time, but 
you cannot fool all of the people all of the time 



Let the predicate P(x,y) be given the interpretation ‘x can be fooled at time y’. 
The statement can then be written as 
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3vWuP(u, n).Vi> 3 uP(u, v).~yv'iuP(u, v) (1.68) 

Note the temptation to use the ‘ — symbol before the last quantihed predicate. 
But this would not capture the true meaning. 

1.5 Decidable Fragments of Mathematics 

When the predicate calculus is used to formalise parts of mathematics it is neces- 
sary to append functions (e.g. , ‘log’) and constants (e.g. 2, — 1, tt) together 

with certain rules involving them (sometimes given in the form of extra axioms). 
Relations (e.g. ‘ >’, ‘coprime’ ) are represented by predicates. 

Although full arithmetic, formalised using the predicate calculus, is not decidable 
there are smaller ‘theories’ within it which are decidable. We illustrate this by two 
theories which are applicable to linear and integer programming. 

1.5.1 The Theory of Dense Linear Order 

The theory of dense linear order models order relations between variables and con- 
stants in a continuum such as the real or rational numbers. The ‘-I-’ operation, the 
real numbers, the ‘=’ and ‘<’ relations (predicates) together with the transitivity 
rule ((x < y)- (y < z) (x < z)) are appended to the machinery of the predicate 
calculus. It is applicable to linear programming (LP). The constraints of an LP can 
be regarded as predicates with the LP variables as variables in these predicates. 

It is easy to verify that we can express the ‘^’ relation ((x < y) v(x = y)). (Also 
we can express ‘>’ and ‘^’ but we choose to suffice with ‘<’ and ‘^’ for simplicity 
of exposition.) Repeated addition allows us to multiply variables by integers, e.g. 3x 
is given by x -|- x -f x. Besides writing expressions such as 2x -|- 3y ^ 5 we can 
include subtraction by, for example, writing x — y ^ 3asx ^ 3-t-y. Also we 
can allow multiplication by rational numbers by, for example, writing |x < 5 as 
2x < 15. For economy of representation we will use these derived relations and 
operations directly. 

We are now in a position to make statements in this system and ask if they are 
true. This is a decidability problem and is illustrated by the following example. 

Example 7.S Is the following statement true ? 

3xVy(((x -h y < 1) V (2x — 3y < 2) V (— x — 2y < -2))- 

((-X < -l)v(-2x-y <2))) (1.69) 

X, yeK 



We can use (1.56) and (1.57) to transform (1.69) to 



3x'3y((— X — y < — !•— 2x-|-3y ^ -2-x+2y ^ 2)v(x ^ l-2x-|-y ^ —2)) (1.70) 
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Expressing this in the form 

3x~ [3>’(— X — y < — 1 • — 2x + 3y ^ —2 • x + 2>’ ^ 2) 

v33;(x < 1 ■ 2x + < -2)] (1.71) 

y can be eliminated from each of the two conjunctive clauses preceded by 3y by a 
process known as the elimination of existential quantifiers. The hrst clause can be 
written in the form 



3y(-x + l<y ■ y<^x-^-y<-^x+l) (1.72) 

We now consider the meaning of the above statement. It is that there exists a 
number from a continuum (e.g. the reals or the rationals) that lies strictly between 
— X + 1 and both of |x — | and — |x + 1. The condition for this is that 

-x + l<^x-^ (1.73) 

and 

-x + l<-^x+l (1.74) 

Clearly (1.72) implies (1.73) and (1.74). But also (1.73) and (1.74) together imply 
(1.72) since we could take y as any number greater than — x+1 and less than or equal 
to the minimum of |x — | and — ^x + 1. In general we have to consider all pairs of 
inequalities where the variable to be eliminated is on opposite sides of the inequality 
sign ‘<’ (or ‘^’). If the quantified variable to be eliminated occurs in an equality 
relation the elimination is even easier. We can use the equality to substitute it out of 
all the other relations. Obviously we modify the resultant inequalities accordingly 
when the lower and upper inequalities before the elimination are different combina- 
tions of ‘<’ and We rewrite (1.73) and (1.74) as 

5 5 

- -X < — (1.75) 

3 3 

and 

- ^x < 0 (1.76) 

The second conjunctive clause in (1.71) can be written in the form 



3y(x ^ 1 • y ^ — 2x — 2) 



(1.77) 
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Since there is no lower limit to y the second inequality is true for all x and may 
be ignored giving the equivalent statement 

X < 1 (1.78) 

Statement (1.69) has therefore been reduced to 

5 5 1 

3x ( — X < — • — X < 0 V X < 1) (1.79) 

3 3 2 

which (using De Morgan’s laws) can be written as 

3x((^x < ^ V ^x < 0)- -X < -1) (1.80) 

or in CNF as 

5 5 1 

3x(-x ^ - ■ —X < —1) V 3x(-x ^ 0 • —X < — 1) (1-81) 

3 3 ' '2 

We now again eliminate the existential quantifier in both clauses to give 

I<lv0<-1 (1.82) 

Clearly both inequalities are false showing the original statement to be false. 



1.5.2 Arithmetic Without Multiplication 

We now consider a decision procedure for arithmetic without multiplication and 
present this in a form which is applicable to integer programming (IP) as discussed 
in Chapter 2. 

As in the theory of dense linear order we consider all terms of the form nx as 
shorthand for x + x + • • ■ + x (« times). The theory looks very similar to the above 
theory except for the fact that the domain of the variables is now restricted to the 
integers. This makes the theory ‘more difficult’ in the sense that a decision procedure 
is more complicated. Again the constraints in an IP can be regarded as predicates. 
We illustrate the procedure by an example. 

Example 1.9 Is the following statement true? 

3xy(-2x - 2y < -7 ■ 8x + 7y < 28 • -X < -1) (1.83) 

X, yeZ 



i.e. where the domain of the variables is the integers. 
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In order to eliminate y and its existential quantifier we write the inner formula in 
the form 



3yOi-2x + 7) < I4y < 2(-8x + 28) • -x < -1) (1.84) 

It is not now sufficient to observe that the expression to the left of the first inequal- 
ity is less than or equal to that on the right. There must also be a multiple of 14 
between them. This can be captured in a. finite way by writing 

7(-2x -h 7) •+• J < 2(-8x + 28) • 7(-2x -h 7) -h ^ = 0 (mod 14) (1.85) 

• — X ^ — 1 

where s = 0, 1, . . . , 13 

Since the left-hand expressions are multiples of 7 (by the congruence) , s must 
be also and can be written as It allowing (1.85) to be simplified to 

7(-2x -t- 7) 4- 7r < 2(-8x -f 28) • -2x -h t = 1 (mod 2) (1.86) 

where f = 0, 1 

Clearly the congruence can only be satisfied if t = 1. Hence the inequality 
reduces to 



2x < 0 (1.87) 

Equation (1.83) has therefore been reduced to 



3x(2x <= 0 • —X ^ — 1) 



( 1 . 88 ) 



which can be written as 



3x(2 ^2x^0) 



(1-89) 



Strictly following our rules we continue the procedure by eliminating x and its 
quantifier to give 



2 + w <— 0 ■ 2 4- U! = 0 (mod 2) 



(1-90) 



where w = 0, 1. 

Clearly the congruence implies w = 0 showing (1.90) to be false. Hence (1.83) 
is false. 

Exercise 1.7.16 is to show that (1.83) is true when the variables are ‘relaxed’ to 
be real (or rational). 

It should be pointed out that this is a particularly simple example. In gen- 
eral we cannot eliminate the implied congruences at each stage. They have to 
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be incorporated in the subsequent eliminations. Full details of the method, when 
applied to integer programming models, are referenced in Sect. 2.6. 



1.6 References and Further Work 

A good introduction to logic is Mendelson [81]. Another text is Shoenfield [101] . 
Also Langer [72] is a very clear text. Russell and Whitehead [96] give the results of 
their formalisation. Godel [44] presents his major results. A ‘popular’ description 
of Godel’ s work is Nagel and Newman [85]. 

The propositional calculus (Boolean Algebra) is due to Boole [18]. Truth tables 
were invented by Wittgenstein [124] and Post [90] who also investigated complete 
connectives. The Sheffer stroke is due to Sheffer [99] and the connective arrow due 
to Peirce [89]. 

The compact way of converting statements from DNF to CNF is due to Tseitin 
[107]. Wilson [123] also presents the method. 

The calculus of indications is due to Spencer-Brown [103]. 

The predicate calculus is usually attributed to Frege [39]. 

The decision procedure for the theory of dense linear order is due to Langford 
[73] and that for arithmetic without multiplication is due to Presburger [91]. 



1.7 Exercises 

1 . 7.1 Use a truth table to show that the following statement is a tautology: 

((A V B) • (A — > C) ■ (B — ^ C)) — > C 

1 . 7.2 Verify, by means of truth tables the equivalences (1.4) - (1.21). 

1 . 7.3 Check that (1.30) represents the statement in Table 1.2. 

1 . 7.4 Use De Morgan’s laws to convert the following statement to DNF: 

(Av BvC)-(BvCvD)-iAvBvD) 

1 . 7.5 Convert the statement in 1.7.4 into DNF by means of the distributive laws and 
verify, by means of truth tables, that the statements produced in 1.7.4 and 1.7.5 are 
equivalent. 

1 . 7.6 Convert the following statement into DNF by introducing new variables to 
represent the disjunctive pairs of statements: 
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(Xi V X2) ■ (Z3 V X4) ■ (X5 V Xe) ■ ■ ■ (X2„-l V X2n) 

1 . 7.7 Verify, by means of truth tables, (1.38) - (1.43). 

1 . 7.8 Show that ‘|’ and ‘J,’ are the only possible complete connectives of two vari- 
ables. 

1 . 7.9 Simplify the following expressions: 







1 . 7.10 Show that ] applied to any number of statements is a complete 
connective. 

1 . 7.11 Represent the following statements using only the connective ] : 

i. (AvBvC)-^(DvE) 

ii. (A — > (Bv O) (B ■ A) 
hi. (A - B -C)V B ■ A) 

1 . 7.12 Represent the statement 1.7.1 1 (i) using a Venn diagram. 

1 . 7.13 Represent the following statement using the predicate calculus: 

Some animals are mammals but not all animals 
are mammals although all mammals are animals. 

1 . 7.14 Represent the following statement by the predicate calculus: 
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Every even number greater than 2 is expressible as the sum 
of two prime numbers. 

Even numbers and prime numbers should be expressed in terms of more elemen- 
tary predicates. 

1.7.15 Is the following statement true? 

^xy'izix -b y <— z-2x — ?>y ^ z - x > z) 
if (i) X, yeK (ii) x, yeZ. 

1.7.16 Show that (1.83) of Example 1.9 is true if the variables are real (or rational). 




Chapter 2 

Integer Programming 



In this chapter we begin with a brief explanation of linear programming (LP) since 
integer programming (IP) is usually regarded as an extension of LP. Also most prac- 
tical methods of solving IP models rely on solving an LP model first. However, 
our discussion of LP will be brief since this is not the main theme of this book. 
Although LP models are often solved in the course of solving IP models we will, 
in the main, regard a method of solving LP models in this context as a ‘black box’ . 
However, we will give a (computationally inefficient) method of solving LP models 
which illustrates the use of the predicate calculus as discussed in Sect. 1.5 as well 
as demonstrating important properties of LP models which are relevant to IP. In 
Sect. 2.4 we discuss the comparative computational complexity of IP compared with 
LP. A list of references to the further study of LP is given in Sect. 2.5. 



2.1 Linear Programming 

A linear programme (LP) is a problem of maximising or minimising a linear expres- 
sion subject to a number of linear constraints which take the form of linear expres- 
sions being less than or equal to greater than or equal to or equal to ‘=’ 
given numbers. For example the following is an LP model. 

Example 2.1 A linear programme 
Maximise 



2x\ -h 3x2 — Xi 



subject to 



xi -h X2 < 3 

—xi -F 2x3 ^ ~2 
— 2xi + X 2 — x$ — 0 



( 2 . 1 ) 



( 2 . 2 ) 

(2.3) 

(2.4) 
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xi, X2, x$ ^ 0 (2.5) 

XI, X2, Xs€ K 



It is usual, but not necessary, for the variables to be restricted to be non-negative by 
the last set of inequalities. Sometimes LPs are expressed in standard form as max- 
imisations subject to all constraints (apart from the non-negativity constraints) 
or alternatively as minimisations subject to all constraints. The expression to be 
maximised or minimised is known as the objective function. Clearly it is possible to 
convert a maximisation to a minimisation (and vice versa) by negating the objective 
function. constraints can be converted to constraints by negating throughout 
(and vice versa). constraints can be converted to a pair of a and a con- 
straint simultaneously. We convert Example 2. 1 to a standard form by the following 
example. 

Example 2.2 Convert Example 2.1 to a standard form 
This becomes 
Maximise 



2x\ + 3x2 — X3 ( 2 . 6 ) 

subject to 



XI +X2 ^3 


(2.7) 


xi — 2x3 ^ 2 


( 2 . 8 ) 


—2xi -h X 2 — X 3 ^ 0 


(2.9) 


2xi — X2 + X3 ^0 


( 2 . 10 ) 


XI, X2, X3 ^ 0 


( 2 . 11 ) 


Xi,X2, X3€ M 





After converting the ‘=’ to a and a the has been negated throughout 
to make it a 

While a standard form is useful, for explanatory purposes, in practice models are 
usually kept in a more general form such as in Example 2. 1 . 

In order to make an LP model more precise we express Example 2. 1 using the 
predicate calculus. 

Example 2.3 Express Example 2.1 using the predicate calculus and solve it by the 
elimination of existential quantihers 
This becomes 

Eind the maximum value of z such that 
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3zXiL2X3 (z — 2xi — 3x2 + X 3 = 0 (2.12) 

• X1 + X2 < 3 

■xi — 2x3 ^ 2 
• — 2xi + X 2 — X 3 = 0 

• — xi ^ 0 

• - X 2 < 0 

• - X 3 < 0 ) 



Xl, X2, Xs€ K 



By means of the first equation z has been set equal to the objective function (2.6). 

We have deliberately quantified the variable z as we wish to test if the model has 
a feasible solution before seeking the maximum z, if it exists, i.e. we are first of all 
finding if the statement is true. 

Eliminating x\ using the first equation, by the method described in Example 1.8, 
we obtain 



3zX2X3(z — X2 + X3 <6 (2.13) 

•z — 3x2 — 3x3 ^ 4 
• — Z + 4x2 — 2x3 = 0 
• — Z + 3X2 — X3 < 0 

• — X2 ^ 0 

• - X3 < 0) 

Eliminating X 2 using the remaining equation 

3zx3(3z + 2x3 ^ 24 (2.14) 

•z — 18x3 ^ 16 

• — z + 2x3 ^ 0 

• — z — 2x3 ^ 0 

• — X3 ^ 0) 

Eliminating X 3 between the inequalities (and carrying out some simplification, 
including removing obvious redundancies) we obtain 



3z(z < 8 

• - z < 0) 



(2.15) 
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Eliminating z we obtain 



0 < 8 (2.16) 

which is clearly true, demonstrating that the original statement ( 2 . 12 ) is true (i.e. the 
LP is feasible). The maximum value of z satisfying (2.15) is 8 . Setting z = 8 the 
first inequality in (2. 15) is satisfied as an equation. Therefore so must be the first and 
last inequalities in (2.14), from which it is derived, making X 3 = 0. These inequal- 
ities, in turn, are derived from the equation and first inequality in (2.13) making 
X 2 — 2. These are derived from the two equations in (2.12) making xi = 1. 

Hence we have the optimal solution to the LP in Example 2.1 



xi = 1, X 2 — 2, X 3 = 0, Objective = 8 (2.17) 

The above method of solving LPs by successively eliminating existential quanti- 
fiers is not the most efficient algorithm. Other algorithms, in particular the Simplex 
algorithm, are mentioned in Sect. 2.5 and references given. It is, however, worth 
remarking that the above method shows LP to be a decidable theory, which was 
the purpose for which the decision procedure was designed. The simplex algo- 
rithm, as originally stated, does not do this as it cannot be guaranteed to termi- 
nate. It can, however, be modified fo guaranfee convergence. References are given 
in Sect. 2.5. The method described here is usually referred to as Eourier-Motzkin 
elimination. Although not computationally efficient it does prove to be a partic- 
ularly clear method of demonstrating important properties of LPs. Lurther algo- 
rithmic discussion is beyond the scope of this book but is fully discussed in the 
many texts on LP. We do, however, discuss those aspects of LP which are rel- 
evant to our main purpose of modelling and solving integer programming (IP) 
models. 



2.1.1 The Dual of an LP Model 

In Example 2.3 we obtained the optimal objective value by successively adding 
or subtracting the equality constraints, in suitable multiples, from the other con- 
straints to eliminate variables and adding the inequality constraints in suitable 
non-negative multiples. The net effect is to express the objective function as a 
linear combination of the constraints in such a way as to eliminate the variables. 
It can be verified fhat this results in multipliers of 1, |, 0, 0, 0, respectively, 

on the constraints which, when added together in these multiples, produces the 
constraint z ^ 8 . The effect of the multipliers is to eliminate all the variables 
(except z) and produce an upper bound on the possible objective values. We can 
formalise this in terms of the original model in Example 2.1, as seeking multipliers 
y\. yi. y^. I’l; V2, Vj on all the constraints (2.2)-(2.5) so as to 
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Minimise 



3yi-2y2 (2.18) 

subject to 



yi- y2- 2 y 3 - m = 2 


(2.19) 


yi + ys - V 2 = 3 


(2.20) 


2^2 - J3 - D3 = - 1 


(2.21) 


yi ^0, J2 < 0, V\, V2, V3 ^0 


(2.22) 



yi,yi, y$, ui, V2, v^e K 



Notice that y 2 is constrained to be non-positive and is not constrained in sign. 
This is because the second constraint is of the form (and therefore effectively 
subtracted from the other constraints if not converted to the form as is done in 
Examples 2.2 and 2.3) and the third constraint is an equation which may be added 
or subtracted from the other constraints. By minimising expression (2.18) we are 
seeking the smallest right-hand side b for expressions of the form z ^ b which 
are implied by the original constraints. This smallest value of b is obviously the 
maximum z which can be obtained. 

The variables n,- (the multipliers on the non-negativity constraints (2.5) stated in 
the form of ‘ constraints) are known as surplus variables. We can ignore them if 
we rewrite (2.18)-(2.22) as follows: 

Minimise 



3yi - 2y2 


(2.23) 


subject to 




yi - y 2 - 2ys ^ 2 


(2.24) 


yi + ys ^ 3 


(2.25) 


2y2 - y3 ^ -1 


(2.26) 


yi ^ 0, j 2 < 0 


(2.27) 


yi, y 2 , y3f R 




The model given by (2.23)-(2.27) is, of course, itself an LP model. It is known 
as the dual model. In contrast the original model, given in Example 2.1, is known 
as the primal model. If the primal model were in standard form as a maximisation 
subject to all the constraints being (apart from the non-negativity constraints on 




30 



2 Integer Programming 



the variables) then the dual model would also be in standard form as a minimisation 
subject to all constraints. 

The decision procedure, which we illustrated in Example 2.3, shows that the 
optimal solution to the dual model gives an attainable upper bound on the optimal 
objective value of the primal model. Hence, if both the primal and dual models 
have finite optimal solutions, their optimal objective values are the same (there is 
no "duality gap’). The optimal values of the dual variables are known as the dual 
values of the corresponding constraints in the primal model. It might be the case 
that the dual model is infeasible in which case the primal model may have no bound 
on its optimal objective value (it is said to be unbounded) or it might itself be infea- 
sible. If the primal model is unbounded then we can produce no upper bound on the 
primal objective value and therefore the dual model can have no solution and must 
be infeasible. All these results are part of the duality theorem of LP. The variables 
in the dual model have important economic interpretations in many applications as 
well as having computational applications. Of course duality is also of mathematical 
interest since there is a symmetry between the primal and dual models. The dual of 
the dual model is the primal model (Exercise 2.6.3). 

We should point out that the duality between LP models, which we have exhibited 
here, is different from the logical duality which we discussed in Chapter 1. 



2.1.2 A Geometrical Representation of a Linear Programme 

The variables in an LP model can be interpreted as the coordinates of points in space. 
The dimension of the space is equal to the number of variables in the model. We 
illustrate this by giving a geometrical representation of the model in Example 2.1. 

As all the variables are non-negative we restrict ourselves to the non-negative 
orthant. The constraints, in this example, restrict us to the triangular region A, B, 
C. This is known as the feasible region. Particular values for the objective function 
lie on planes with the orientation of that shown. Increasing values of the objective 
function arise as the planes are moved in the direction shown. However, we need 
a plane that intersects the feasible region. This happens, for this example, when 
an objective plane intersects the vertex B of the feasible region, whose coordinates 
x\ = 1, X 2 = 2, .V 3 = 0 therefore give the optimal solution (Pig. 2.1). 

In this particular example the feasible region is non-full dimensional, i.e. not of 
a dimension equal to the number of variables. It illustrates a number of features of 
the geometrical representation of LPs which generalise. 

The feasible region is a polytope. These are regions of space, in any number of 
dimensions, whose boundaries are hyperplanes, i.e. lines, planes or higher dimen- 
sional generalisations. They are of dimension one less than the dimension of the 
polytope. In this example the polytope of the feasible region is closed. In general, 
however, LP polytopes can be open, as illustrated by Example 2.5. Such polytopes 
are sometimes referred to as polyhedra. For an LP polytope the feasible region 
can be characterised by its boundaries, known as its facets or, alternatively by its 
vertices. 
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Different values of the objective function will be represented by hyperplanes in 
the space. Moving to parallel hyperplanes in a direction orthogonal to the hyper- 
planes will increase the objective value and in the other direction decrease the 
objective value. As in the example the optimal solution will normally correspond to 
where an objective hyperplane intersects a vertex of the feasible region. However, 
it may be the case that the orientation of the objective hyperplane is such that there 
are alternative optimal solutions not at vertices. But among the alternative optimal 
solutions there will still be vertex solutions (so long as the model is not infeasible 
or unbounded). 

Some of these different features of LP models are illustrated in examples below 
and in exercises in Sect. 2.6. We emphasise again that our purpose is to explain the 
nature of LP in so much as is necessary for the understanding of IP. There are many 
other texts that give a rigorous mathematical derivation of these properties, some of 
which are referenced in Sect. 2.5. 

We give some other examples of LP models 

Example 2.4 A. full-dimensional LP model 

Maximise 



— Ax\ + 5x2 + 3x3 



(2.28) 
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subject to 



—Xi + ^2 — X 3 ^ 2 


(2.29) 


XI + X2 + 2x3 ^ 3 


(2.30) 


XI, X 2 , X 3 ^ 0 


(2.31) 


xi, X 2 , X 3 eK 





This is illustrated in Fig. 2.2. Note that this model is in standard form. 




It can be seen that the feasible region has six vertices O, A, B, C, D and E 
with their coordinates marked. The orientation of the objective plane shows that 
the (unique) optimal solution arises from vertex C giving the solution xi = 0, X 2 = 
objective = y. 

Example 2.5 An LP with an open feasible region 
Minimise 



- Xl+ X2 



(2.32) 




2. 1 Linear Programming 



33 



subject to 



xi + 2x2 ^ 5 


(2.33) 


—2xi -b X 2 ^ —2 


(2.34) 


Xi,X2 ^ 0 


(2.35) 


xi, L2eK 





Note that this model is also in standard form. It is illustrated in Fig. 2.3. (When 
we have a minimisation model we adopt the convention that dual values on con- 
straints are non-negative, whereas dual values on constraints are non-positive, 
in contrast to these conventions being reversed for maximisation models.) 




Fig. 2.3 An LP model with an open feasible region 



The optimal solution occurs at vertex B giving xi = = |, objective = — i. 

If, however, the objective had been (say) 

Minimise 



- 3xi + X2 (2.36) 

then the model would have been unbounded since feasible solutions of ever- 
decreasing objective value could be found. 
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A polyhedron with an open feasible region (in any number of dimensions) is 
characterised by its vertices and its extreme rays. In the example AD and BC are 
extreme rays. If an LP is unbounded then the unboundedness is represented by an 
extreme ray such as BC, for this example, with objective (2.36). Note that points on 
this extreme ray give ever-decreasing values for objective (2.36). 

It has already been remarked that if an LP is unbounded the corresponding dual 
model is infeasible, i.e. the decision procedure illustrated in Example 2.3 will pro- 
duce no lower bound (in the case of a minimisation) on the objective (see Exer- 
cise 2.6.5). We give the dual model to Example 2.3 (with objective (2.36)). 

Example 2.6 An infeasible LP model 
Maximise 



subject to 



5yi - 2 j2 


(2.37) 


yi - 2y2 < -3 


(2.38) 


2yi + y2 ^ 1 


(2.39) 


yi , y2 ^ 0 


(2.40) 


yi, y2f R 





In Fig. 2.4 it is shown that the constraints above are self-contradictory. 

In order to satisfy constraint (2.38) (and the non-negativity constraints) we have 
to lie in the upper open region, but in order to satisfy constraint (2.39) we have to 
lie in the lower triangular region. These regions have no points in common. 

Note that the property of infeasibility is independent of the objective function. If 
a model is infeasible it has no solutions irrespective of the objective. 

It has already been remarked that if a model is unbounded its dual must be infea- 
sible. Exercise 2.6.6 is to show that the dual of Example 2.6 (which is infeasible) 
is Example 2.5 with objective (2.36) (which is unbounded). It was also, however, 
remarked that if a model is infeasible there is another possibility: its dual may also 
be infeasible. Exercise 2.6.7 demonstrates this. 

The method of solving LP models by the elimination of existential quantifiers, 
illustrated in Example 2.3, can be regarded in a geometrical context as a method 
of projection. Each time a variable is eliminated the model is projected down to an 
equivalent model in a space of one less dimension. The method can be valuable in 
reformulating both LP and IP models. 

LP models arise in many contexts such as production, distribution, blending, 
oil rehnery scheduling and chemical processing to name only a few. Models can 
involve millions of constraints and variables. In Sect. 2.5 references are given to 
comprehensive sources which discuss practical applications. 

Before discussing IP we should point out that there are important classes of 
apparent IP models which can be solved as LPs as this leads to integer values for the 




2.2 Integer Programming 



35 




Fig. 2.4 Constraints defining an infeasible LP model 



variables. In particular the minimum cost network flow model (with integer flows in 
and out of the network) and special cases such as the transportation and assignment 
problems have this property. They will not be discussed further here but references 
given in Sect. 2.5 discuss them. 



2.2 Integer Programmin g 

If all or some of the variables in an, otherwise, LP model are restricted to take 
integer values then we have a pure or a mixed IP (MIP) model, respectively. As in 
other branches of mathematics models involving integers are much more difficult 
to solve than models only involving real numbers. In Sect. 2.4 we discuss the issue 
of computational complexity and point out that IP models are generally much more 
difficult to solve than comparably sized LP models. 

In Sect. 2.3 we point out that in most practical models the integer variables are 
restricted to the two values 0 and 1. Hence the connection with the propositional 
calculus. In this section we confine our attention to solving pure IP models with 
general integer variables. It will be seen that the methods effectively specialise to 
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the pure and mixed 0-1 case. For completeness we solve an illustrative model by the 
decision procedure illustrated in Example 1.9. Then we discuss the more efficient 
methods employed in practice. We consider the following example. 

Example 2. 7 Solve the following IP model using the decision procedure for arith- 
metic without multiplication 
Maximise 

xi + X2 (2.41) 

subject to 

2x\ -\r 2 x 2 ^ 3 
—2x\ -\- 2x2 ^ 3 
4xi + 2x2 ^ 19 
X\,X2^Q 

X\, X2€ Z 

We write the model in the form 
Maximise z 
such that 

3zxiX2(z — xi — X2 — 0 (2.46) 

• — 2xi — 2x2 ^ —3 

• — 2xi -t- 2x2 ^ 3 

• 4xi + 2x2 ^ 19 

• — xi ^ 0 

• - X2 < 0) 

z, xi, X 2 e Z 

Eliminating xiusing the equation gives 

3zx2(-2z < -3 (2.47) 

• — 2z -1- 4x2 ^ 3 

• 4z — 2x2 ^ 19 



(2.42) 

(2.43) 

(2.44) 

(2.45) 
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• - Z + X2 < 0 
• - X2 < 0) 
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Since the variable eliminated (xi) has a coefficient of =pl in the equation in (2.46) 
no congruences are needed in (2.47). 

We rewrite this in the form 



3zX2( 


-2z 




-3 




4x2 


< 


2z + 3 


2(4z- 


19) 




4x2 




4X2 




4z 




■ 0 


< 


4X2) 



Eliminating X 2 reduces the system to 





3z(-2z 




-3 


2(4z- 


19) 3- 2s 




2z -3 3 


2(4z- 


19) 3- 2s 




4z 




■ 0 




2z-3 3 




• 0 


< 


4z 


• 4z 


- 19-h^ 


= 


0(mod 2)) 








^ 6 {0, 1} 



The congruence demonstrates that = 1 . 

Therefore (2.49) (after simplihcation) can be rewritten as 

3z(2z < 13 ■ -2z < -3) 

Eliminating z (and simplifying) gives 



(2.48) 



(2.49) 



(2.50) 



3 + t^l3-3 + t = 0(mod 2) 

t e {0, 1} (2.51) 

In order to satisfy the congruence f = 1. The inequality in (2.51) is then obvi- 
ously true showing the IP to be feasible. 

The maximum value of z satisfying (2.50) is 6 arising from the hrst inequality 
in (2.50). This inequality arises from the second and third inequalities in (2.48) 
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showing that 10 ^ 4x2 ^15 making X 2 = 3. These inequalities in turn arise from 
the equation and third and fourth of the inequalities in (2.47). The (unique) solution 
is therefore 



xi =3, X 2 = 3, objective = 6 (2.52) 

We now use the above example to illustrate the method usually used in practice 
for solving IP models. It is usually called the ‘branch-and-bound’ algorithm for 
reasons which will become apparent. 



2.2.1 The Branch-and-Bound Algorithm 

Example 2.8 Solve the model in Example 2.7 by the branch-and-bound algorithm. 

The hrst step in this method is to remove the integrality requirements and solve 
the resultant LP model. This is referred to as the LP relaxation. LP is computation- 
ally ‘easy’ compared with IP, as discussed in Sect. 2.5. If the optimal solution to 
the LP relaxation turns out to be integral then this will also be an optimal solution 
to the IP. There is an important class of models where this will always be the case. 
However, for this example, we obtain the fractional solution 

2 1 5 

xi = 2-, x 2=4-, objective = 6 - (2.53) 

3 6 6 

We now choose one of the integer variables (x) whose value has come out frac- 
tional (e.g. N + f where N is an integer and 0 < / < 1). There is no loss of 
generality in stipulating that either x^N or x'^N+ \ since x is not permitted 
to take any fractional value between N and N + \. Both these possibilities rule out 
the current fractional solution. For the example we choose the dichotomy xi ^ 2 or 
xi ^ 3. The choice of variable x\ over X 2 is arbitrary here. In practice it can be made 
heuristically, i.e. according to plausible (but not guaranteed) rules which might be 
hoped to produce the optimal integer solution as quickly as possible. For example 
it might be thought that since x\ is further from its closer integer than X 2 this will 
have more effect than choosing X 2 . It is convenient to represent this process as a 
branching in a tree as shown in Fig. 2.5. 

At each node we give the solution of the corresponding LP relaxation. Node 2 
corresponds to the original model together with the appended constraint x\ ^ 2. 
Node 3 corresponds to the original model together with the appended constraint 
xi ^ 3. We number the nodes in order of the creation of the corresponding models. 
Besides the choice of branching variable there is also a choice of order in which to 
solve the LP relaxations of the nodes created. It is convenient to solve each of the 
two resultant LP models (the ‘son ’ and ‘daughter’) immediately after their creation. 
It can be seen that the objective values corresponding to the LP relaxations of the 
two nodes have both got ‘worse’ (declined for a maximisation or increased for a 
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= 2, a :2 = 3.5, Ohjective = 5.5 = 2, a :2 = 3.5, Ohjective = 6.5 



Fig. 2.5 A branch in the branch-and-bound algorithm 



minimisation). This would be expected since extra constraints are being added. In 
fact they might not change if there were alternate solutions. 

At this stage we have a choice of waiting nodes to ‘develop’, i.e. choose which 
node to branch from. This choice can again be made arbitrarily or heuristically. 
For example we might choose to develop node 3 in preference to node 1 since its 
objective has declined less (and there is therefore more chance of a better integer 
solution down this branch). For our example we continue this process and illustrate 
it in Fig. 2.6. 

At node 5 the LP relaxation (and therefore the corresponding IP) is infeasible. 
This result is likely to happen as we are adding progressively stronger constraints 
down a branch. Once a node has become infeasible there is no sense in developing 
it further. It is said to have heen fathomed. Node 4 is then developed in preference to 
the other waiting node 2 if we use the heuristic that the objective of the LP relaxation 
at node 4 is better than that at node 2. At node 6 we obtain an integer solution with 
an objective value of 6. Again there is no sense in developing this node further which 
is also said to have been fathomed. Since the objective value is better than that at 
node 2 no better integer solution can be obtained by developing that node. It is also, 
therefore, said to have been fathomed. The value of the best integer solution found 
to date provides a bound (hence the use of this word in the title of the method) on 
the optimal objective value which we have used to terminate the branch at node 
2. Solving the LP relaxation at node 7 gives an objective value of 5 ^ which again 
is below the current bound allowing us to terminate this branch. Since there are 
no more waiting nodes the integer solution found at node 6 has been shown to be 
optimal. This is the solution already obtained from Example 2.7 and given in (2.53). 

Although we have obtained the optimal integer solution there is no guarantee that 
there might not be other (suboptimal) integer solutions if we had developed nodes 2 
and 7. Exercise 2.6.8 involves developing these nodes to see if there are other integer 
(non-optimal) solutions. 
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Fig. 2.6 A branch-and-bound tree 



Exercise 2.6.9 is to resolve this model by branching on variable X 2 instead of 
xi at the top of the tree. This will result in a totally different tree but it should be 
apparent that it must also result in the same optimal objective value and, in this 
example, the same (unique) optimal solution (although, in general, there could be 
alternate optimal solutions). 

A feature of this method, which should be remarked on, is that immediately after 
a branch the branching variable takes the value in the added constraint. This will 
always happen but there is no reason why the variable will not deviate from this 
value further down the tree, as for example happens at node 4. Once an extra con- 
straint has been imposed at a branch this extra constraint will apply at all branches 
below the relevant node. For example xi is greater than or equal to 3 at all nodes 
from node 3 and below. 

If an integer variable is restricted to a hnite range of values (as is normally the 
case in practice) then it should be apparent that this method must converge. In the 
worse case, the branching process will result in fixing the integer variables at all 
the (finite number of) possible values at different nodes. This could result in an 
exponential number of nodes. The potential number of nodes at each successive 
level doubles giving 2" nodes at level n. Exercise 2.6. 10 demonstrates this. However, 
if some of the integer variables are not restricted to lie in a finite range then there 
is no guarantee that the method will terminate. Example 2.9 demonstrates this. This 
shows that branch-and-bound is not formally a decision procedure for IP in contrast 
to the method illustrated in Example 2.7. 
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Notice (for example) that the right branch from node 1 (x\ ^ 3) and the left 
branch from node 4 (xi ^ 3) have fixed x\ at 3. 

It is useful to illustrate the branch-and-bound method, as performed in Fig. 2.6, 
geometrically in Fig. 2.7. 




Fig. 2.7 A geometrical representation of branch-and-bound 



ABCD represents the feasible region of the LP relaxation. The optimal solution 
of the LP relaxation corresponds to vertex C. By branching on xi we create the 
two regions ABEF and GHD corresponding to the LP relaxations at nodes 2 and 3. 
Vertices E and H correspond to the solutions at those nodes. Branching on X 2 at 
node 3 creates an empty feasible region for X 2 ^ 4 and the region GIJD for X 2 ^ 3. 
The solution at node 4 corresponds to vertex J. Branching on x\ creates the feasible 
regions GI (one dimensional) and KLD. The solutions at nodes 6 and 7 correspond 
to vertices 1 and L. I clearly corresponds to the optimal integer solution. 

Example 2.9 Demonstrate that branch-and-bound does not terminate with the fol- 
lowing model: 

Minimise 



subject to 



X2 



(2.54) 



3xi — 3x2 ^ 1 



(2.55) 
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4xi — 4x2 ^ 3 

Xi ^ 1 



(2.56) 

(2.57) 



Xi, X2 € Z 

The resulting branch-and-bound tree is given in Fig. 2.8. It should be apparent 
that the tree will never terminate producing LP relaxations with ever-increasing 
objective values. Figure 2.9 demonstrates what is happening. The LP relaxation 
at node 1 corresponds to point A. After branching on X2 we have alternate (all 
fractional) solutions to the resulting LP relaxation and choose (arbitrarily) that cor- 
responding to point B. Continuing in this manner we produce solutions at points 

C,D, The problem is that the IP has no integer solution (it is infeasible). But 

branch-and-bound cannot detect this as it keeps finding feasible solutions to the LP 
relaxations. As mentioned before, in practice, this is unlikely to happen since integer 
variables usually have finite bounds. But it can be the case that branch-and-bound 
will take an exorbitant amount of time with some models. Exercise 2.6. 1 1 is to solve 
this model by the decision procedure demonstrated in Example 2.7. This, of course, 
terminates finitely showing that there is no solution. 




Fig. 2.8 A non-terminating branch-and-bound tree 



If we apply branch-and-bound to a mixed IP then we only branch on those vari- 
ables restricted to take integer values. If the integer variables are restricted to take 
the values 0 or 1 (in both the pure and mixed cases), as is most common, then the 
branch-and-bound tree becomes simpler. Each branch fixes a variable at 0 or 1 in 
the next and all subsequent nodes down the branch. 
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Fig. 2.9 Geometric representation of non-terminating branch-and-bound tree 



2 . 2.2 The Convex Hull of an IP 

In Fig. 2.10 we draw the feasible region of the LP relaxation of Example 2.7 (with 
the same lettering as in Fig. 2.7). 

We also mark the points with integer coordinates inside this region. These are 
the feasible solutions to the IP in Example 2.7. Although the feasible solutions are 
disconnected, if we are using LP to help us solve the IP, we can surround these 
integer points by the smallest convex region containing them. This is known as the 
convex hull of feasible integer points. It is marked by the dotted lines in the figure. 
It is obvious that its boundaries must be straight lines in this two-dimensional case. 
In higher dimensions they will be hyperplanes. They could therefore be represented 
by linear constraints. In this example these are 



■^1 + .^2 ^ 2 


(2.58) 


XI ^ 1 


(2.59) 


-Xi -1- X2 < 1 


(2.60) 


X2 < 3 


(2.61) 


2x\ -h X 2 ^ 9 


(2.62) 
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Fig. 2.10 The convex hull of a pure IP 



Xi 

•T2 



5% 4 


(2.63) 


^0 


(2.64) 



If we were to use objective (2.41) and replace (or append) the constraints 
(2.42)-(2.45) by those above then the IP in Example 2.7 could be solved as a (much 
easier) LP. The optimal solution would be given by vertex I in Fig. 2.10. 

The boundaries of the convex hull are known as facets and the constraints giving 
rise to them are known as facet-defining constraints. 

In practice they may be very difficult to calculate. However, it may be possible to 
calculate some of them or approximate them by constraints which cut off some of 
the feasible points of the LP relaxation feasible region, but are not facets. All such 
constraints are known as cutting planes. Appending them to a model, when known, 
could be expected to reduce the size of the tree search in branch-and-bound. How- 
ever, another problem, which frequently arises, is that there may be an astronomic 
number of facets. This is the case with, for example, all known IP formulations of 
the famous travelling salesman problem (TSP) (which is discussed in Sect. 2.3). In 
practice one may only add them (or other cutting planes) iteratively when they are 
known to cut off the current fractional solution. 

Formulations that lead to constraints representing the convex hull are known as 
‘sharp’’ and those which are closer to the convex hull than others are known as 
‘tighter’. 

Such procedures can be Implemented as another method of solving IP models, 
i.e. solve the LP relaxation, add cutting plane(s) to remove the fractional solution 
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and keep repeating. Alternatively this approach can be implemented in a branch- 
and-bound tree (known as branch-and-cut) by applying it to the IPs corresponding 
to intermediate nodes in the solution tree. 

If a model is a mixed IP then one is only restricted to integer coordinates in 
some directions. If, for example, we were only to restrict x\ to take integer values 
in Example 2.7 the feasible solutions would be represented by the vertical lines in 
Fig. 2.11. The convex hull would be as represented by the dotted lines, and the 
optimal solution lies at vertex H. 




Fig. 2.11 Convex hull of a mixed integer programme 



The facet-dehning constraints for this example are then 



xi + 2x2 ^ 2 


(2.65) 


x\ ^ 1 


(2.66) 


— 2xi + 2x2 ^ 3 


(2.67) 


2x2 ^ 7 


(2.68) 


4xi + 2x2 ^ 19 


(2.69) 


xi < 4 


(2.70) 


X2 ^ 0 


(2.71) 
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The facet-defining constraints of an IP can be regarded as the strongest bounds on 
certain linear expressions which can be inferred from the constraints of an IP. This 
is analogous to the way in which LP can be regarded as an inference problem as 
mentioned in Sect. I.l. Hence both types of problem could be regarded as inference 
problems. For example suppose we were to choose the objective 

Maximise 

X2 (2.72) 

subject to the constraints of Example 2.7. The answer would clearly be X 2 = 3 as a 
result of the facet 2, i.e. the constraints imply X 2 ^ 3 but nothing stronger. 

It should be obvious that it is not possible to derive the constraint X 2 ^ 3 by an 
argument involving only multipliers on the constraints, leading to a dual model. The 
best that can be obtained by such an argument is X 2 ^ 4g. 4g is the optimal objec- 
tive value to the LP relaxation with objective (2.72). Viewed as inference problems 
finding the strongest implied constraints of an IP is therefore more difficult than 
simply adding constraints in certain multiples (the LP dual values). The difference 
between the optimal LP objective value and the optimal IP objective value is some- 
times known as the duality gap for an IP because the LP optimal value is the best 
that can be obtained by dual multipliers. It is an indication of the difficulty of an 
IP. Clearly it is the difference between the objective value for the LP relaxation at 
the top of a branch-and-bound tree and the objective value at the node giving the 
optimal IP solution. Therefore it is some (albeit crude) measure of the likely size of 
the tree. 

There is, however, a remarkable procedure for finding the strongest implied con- 
straints (the facet-defining constraints) of a pure IP. It consists of the repeated appli- 
cation of the following operations. (For simplicity we will consider the case of a 
maximisation model with constraints.) 

i. Adding constraints together in suitable non-negative multiples. 

ii. Dividing through by the greatest common divisor of the left-hand-side coeffi- 
cients and rounding down the right-hand-side coefficient. 

We illustrate this by reference to the constraints of Example 2.7. 

Dividing constraints (2.43) and (2.44) each through by 2 we obtain, respectively 

3 

-Xi +X2^f - (2.73) 

19 

2x\ + X 2 ^ — (2.74) 

These imply, since the left-hand expressions are integer. 



-X\ -h X2 < 1 
2xi -h X2 ^ 9 



(2.75) 

(2.76) 
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Multiplying (2.75) by 2 and (2.76) by 1 and adding we obtain 

3X2 < 11 



Dividing by 3 we obtain 



X2 < 



11 

y 



which, with rounding, implies 



X2 < 3 



(2.77) 



(2.78) 



(2.79) 



giving the facet-defining constraint. 

Unfortunately it is difficult to find a general way of carrying out this process 
systematically. Roundings are nested within the combining of constraints to an inde- 
terminate depth. The depth of rounding is significant and is known as the Chvdtal 
rank. Constraint (2.79) has Chvatal rank 2 since we had to apply two roundings to 
obtain it. 

Since the strongest bound on any objective function for an IP (such as (2.41)) 
can be obtained as an appropriate linear combination of the facet-defining con- 
straints (using LP dual multipliers) we can also obtain this using a nested combi- 
nation of rounding and linear combinations of the original constraints. The depth 
of the rounding in this case is referred to as the Chvatal rank of the model. This 
is often used as a measure of its difficulty. Most of the ‘easy’ problems of IP have 
rank 0 (i.e. they can be solved as LPs) or 1. However, most well-known difficult 
classes of IP problems have ‘unbounded’ rank, i.e. one can find problems in the 
class which have a rank of any number one chooses. Again the rank of the TSP is 
unbounded. 

It will be shown in Chapter 4 that one of the important methods of logical infer- 
ence, i.e. resolution, can be represented in an IP form by rank 1 cutting planes. 

Exercise 2.6.14 is to find the optimal bound on the objective of Example 2.7 as a 
Chvdtal function, i.e. a nested combination of rounding and linear combinations. 

In view of the importance of the special case of 0-1 IP models we illustrate the 
concept of the convex hull of a (small) pure 0-1 model. 

Example 2.10 Vmd the convex hull of the feasible solutions to the following 
constraints: 



Si -f ^2 + 2 S 3 < 2 (2.80) 

0 Si,S2,^3 < 1 (2.81) 

Si , S 2 , S3fZ 



Eigure 2.12 illustrates the polytope associated with the LP relaxation. 
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Constraint (2.80) is represented by the plane ABCD. It can be seen that the 
polytope of the LP relaxation has vertices 0,A,B,C,D,E and F. Hence optimis- 
ing an objective subject to (2.80) and (2.81) (ignoring the integrality require- 
ment) could lead to one of the solutions associated with the fractional vertices 
at B or D. 

The convex hull of feasible 0-1 integer solutions is shown in Fig. 2.13. 

The convex hull of the feasible integer solutions is defined by the constraints 



51 -b 53 < 1 (2.82) 

5 2 -b S 3 < 1 (2.83) 



together with (2.81). 

It can be seen that all the vertices have 0-1 coordinates. 

In general the feasible region associated with both the FP relaxation and the 
convex hull of a 0-1 IP model is a multidimensional hypercube with some of the 
corners sliced off. 
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Fig. 2.13 The convex hull of a 0-1 IP model 



2.3 The Use of 0-1 Variables 

As has already been stated the majority of integer variables in practical IP models 
are restricted to the two values 0 and 1 . If there are general integer variables then 
they would be restricted to a small integer range in almost any sensible IP model. 
Variables that, in reality, should be integer (e.g. number of cars manufactured or 
people of a certain category), which could take large values, would be treated as 
continuous and their solution values in the optimal solution rounded to give sensible 
answers. 



2.3.1 Expressing General Integer Variables as 0-1 Variables 

It should be pointed out that general (bounded) integer variables can always be 
expressed as a combination of 0-1 variables. We demonstrate by the following 
example. 

Example 2.11 Express the integer variable x (0 ^ x ^ f/) using 0-1 variables. 

The most compact way of doing this is a ‘binary’ expansion 



X — Xi “H 2x2 “t” 4x3 “b 8 x 4 “b * * * “t“ ^ ^ x\iog2^i ~b 1 



(2.84) 
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where xi, X 2 , ■ . . , xiiogjuj + 1 are 0-1 variables. 

Clearly by considering all possible 0 or 1 settings for these values we can repre- 
sent any integer less than 2 ^*°S 2 . If necessary we may also impose the bound 

xi -j- 2x2 4x3 4“ ^X4 -f- * * * 2^^^^^ ^"^XLioggf/j -{- 1 t/ (2.85) 



2.3.2 Yes/No Decisions 

The 0-1 variables are usually used to represent yes/no decisions which by their 
nature must be ‘discrete’, i.e. ‘do we build this warehouse here or not ?’, ‘do we 
make any of this product line or not ?’, etc. Other decisions will be contingent 
on these decisions, hence the need for ‘logical modelling’. We illustrate this by a 
number of examples. Systematic ways of modelling with 0-1 variables using logic 
are discussed in Chapter 3. Our purpose here is to show the modelling potential of 
IP and present some of the most common applications. 

We cannot do justice to this topic in the short space of one section but we aim 
to give a taste for the modelling potential and sketch the major application areas. 
Further references are given in Sect. 2.5. 



2.3.3 The Facility Location Problem 

Example 2.12 Formulale, the following (‘capacitated’) facility location problem 
as an IP. 

We can build warehouses at locations iel to supply customers (with some com- 
modity) at locations jeJ. Over the period considered (say a month) a warehouse 
at i has capacity Ki and the customer at j has requirement Dj . The monthly fixed 
cost of a warehouse at i is . The cost of supplying a unit of the commodity from 
the warehouse at i (if built) to the customer at j is cij . Choose where to build the 
warehouses and how much of the commodity to supply from each of them to each 
customer so as to minimise total cost. 

We can introduce the following variables: 

Xij — monthly quantity sent from warehouse i to customer j . 

Si = 1 if a warehouse is built at i. 

— 0 otherwise. 



CyXy + fiSi 
leljej iel 



Our mixed IP model is then 
Minimise 



(2.86) 
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subject to 



'^Xij ^ KiSi Viel 


(2.87) 


jeJ 




WjeJ 


(2.88) 


i€l 




Viel, 'ijeJ, 0<3,<1 Vie/ 


(2.89) 


XijfK, S,eZ Wiel.WjeJ 





Notice this use of the quantifier ‘V’, but here it is indices which are being quan- 
tified over the domains of their index sets. Such a notation is common in certain 
modelling languages as discussed in Chapter 3. The use of the Greek letter ‘3’ 
for 0-1 variables is common in contrast to the Roman alphabet for continuous 
variables. 

It is necessary to model this problem as a (mixed) IP because of the discrete 
nature of the decision to build, or not build, a warehouse at each location. The 
building of a fraction of a warehouse (resulting in an LP model) is not an option. 
The fi are fixed costs which will either be incurred in their entirety, or not at all. 
These are in contrast to the c,j which are variable costs which can be incurred in 
continuously varying fractions depending on the continuously varying values of the 
variables xtj . 

In (2.89) we have represented the variables S, as general integer variables which 
must lie in the range 0-1, so restricting them to the two values 0 and 1. It is more 
usual to specify them directly as binary variables (5,e{0, 1). 

This problem is an instance of the fixed charge problem, which is further dis- 
cussed in Chapter 3. Note that it is not necessary to explicitly model the condition 
that if a warehouse is opened (and therefore a fixed cost incurred) one will send a 
positive quantity out of it. This must happen by virtue of optimality for to open it 
and not send any quantity from it would be non-optimal. Hence this condition is 
guaranteed by the objective function. 



2.3.4 Logical Decisions 

Once we have 0-1 variables we can make logical statements about the correspond- 
ing decisions. Example 2.13 illustrates this. 

Example 2.13 Model the following extra conditions in the facility location problem. 



i. At most n warehouses can be built where n ^ | / 1 . 

ii. If any warehouses are built at ieP then no warehouses can be built at ke Q but 
at least three warehouses must be built at leL. 
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Condition (i) can be modelled by 



^ S,- < « (2.90) 

iel 

Condition (ii) can be modelled by 

^ieP, ^keQ (2.91) 

J2s,^3Si WieP (2.92) 

leL 

Many variants and extensions of the facility location problem can also be mod- 
elled. The problem typifies a whole class of problems involving discrete decisions 
which have to be made if other activities are to be carried out. 

Example 2.14 A blending problem has been formulated as an LP with variables x,- 
representing the proportions of ingredients i . In addition it is stipulated that 

i. If an ingredient i is included it must be at a proportion of at least a, . 

ii. At most n ingredients can be used. 

iii. If any ingredient ie/ is used then at least one ingredient jeJ must also be used. 

We introduce 0-1 variables representing the presence of ingredient i and the 
following constraints: 



Xi < (2.93) 

Xi ^ a,(5i (2.94) 

If Xi > 0 then (2.93) forces iS,- = 1. Then 3, = 1 forces x,- ^ a, by (2.94). 
Condition (ii) is similar to that in Example 2.13 and is modelled as 

(2.95) 



Condition (iii) is modelled as 

Wiel (2.96) 

J^J 

Again the above types of constraints may also arise in other contexts than 
blending. 

It is sometimes natural to model certain conditions as products of 0-1 variables. 
Of course such a product is itself a 0-1 quantity which can be represented as a 0-1 
variable with a logical dependence on the original variables. Example 2.15 illus- 
trates this. 
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2.3.5 Products of 0-1 Variables 

Example 2.15 A model contains the product of two 0-1 variables. Convert the 
model to a linear form. 

We replace this product with a new 0-1 variable y and the constraints 



3i + — y ^ 1 


(2.97) 


5i - y ^ 0 


(2.98) 


S 2 - y ^ 0 


(2.99) 



Equation (2.97) forces y to be 1 if and S 2 are both 1. Equations (2.98) and (2.99) 
force y to be 0 if either (5i or 52 (or both) take the value 0. 



2.3.6 Set-Covering, Packing and Partitioning Problems 

There is an important class of problems which give rise to pure 0-1 IP models known 
as set-covering, packing and partitioning models. We illustrate this by an example 
of a set-covering model. 

Example 2.16 Cover all the elements {1,2, 3, 4, 5} using the minimum number of 
subsets {1, 2,3,5}, |1,2,4,5}, {1,3,4}, {2,3,4,5}, {3,4,5}. 

We use 0-1 variables &j which equal 1 if and only if the yth subset is used. Each 
element of the original set gives rise to a constraint. In order to cover an element 
the sum of the variables representing the subsets containing the element must be at 
least 1 . The model then becomes 



Minimise 




+ ^2 -h 53 + ^4 + ^5 


(2.100) 


subject to 




Si -h S 2 -t S 3 ^ 1 


(2.101) 




Si + S 2 -t S 4 ^ 1 


(2.102) 




^1 + ^3 -b 54 + 5s ^ 1 


(2.103) 




S 2 + S 3 S 4 + Ss'^ 1 


(2.104) 




^1 + ^2 + i 54 + 5s ^ 1 
5i, S 2 , S 3 , S 4 , 556 ( 0 , 1} 


(2.105) 
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This type of model arises in a number of contexts. One of the most important is 
crew scheduling (particularly for airlines) where each subset corresponds to a roster 
which, if used, must be assigned to a possible crew. The objective is to minimise 
the number of crews necessary to cover all the flights. Each flight gives rise to a 
constraint. The model also arises in computer circuit design and logic simplification. 
This latter application is discussed in Chapter 4. 

It is worth pointing out that, in practice such models can have a huge number of 
variables. However, they are often amenable to simplification by preprocessing. For 
example S 5 is redundant since 54 covers everything that ^5 does and more at equal 
objective ‘cost’. The fifth constraint can then be removed as it is implied by the 
second. This reduces the model to 

Minimise 

^1 + ^2 + i53 + ^4 (2.106) 

subject to 



+ ^2 + ^3 ^ 1 


(2.107) 


^ 1 


(2.108) 


151+^3 + 54^ 1 


(2.109) 


52 + ^3 + ^4 ^ 1 
5j, ^ 2 , ^ 3 , ^4 C { 0 , 1 } 


( 2 . 110 ) 



Exercise 2.6.17 is to solve this model by branch-and-bound. 

Set-covering models look deceptively easy to solve but can be very difficult, 
usually on account of their size. They do have important mathematical properties. 
In particular there will always be an optimal integer solution which corresponds to a 
vertex of the polytope for the LP relaxation (but normally not the vertex associated 
with the LP optimum). The Chvatal rank for the class of set-covering models is not 
bounded. 

Closely related are set packing models where the objective is to ‘pack’ as many 
subsets into a given set, as possible, with no overlap. A particularly important case 
arises where the set of elements can be represented as the nodes of a graph. The sub- 
sets are pairs of nodes which are joined by an edge. The packing problem becomes 
that of finding the maximum ‘pairing’ of nodes where each node has a unique ‘mate’ 
(if it has one at all). This is known as the matching problem. Formulated as an IP 
it can be shown to be of Chvatal rank 1 . A special case of the matching problem is 
to ‘bipartite’ graphs. Here there are two sets of nodes and matchings can only be 
between nodes of different sets. This problem is usually known as the assignment 
problem. An optimal solution to the LP relaxation is integer. The Chvatal rank is 
therefore 0. A set partitioning model is that of partitioning the original elements into 
a collection of the given subsets. Packing and partitioning models can be converted 
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into one another. In all these types of models, with some applications, objectives 
with non-unit coefficients may be used. 



2.3. 7 Non-linear Problems 

Example 2.17 demonstrates how a certain type of non-linearity in an IP model 
may be linearised. Non-linear models in general may be remodelled, to a degree 
of approximation and an increase in size, as linear IP models. Hence the distinc- 
tion between linear and non-linear in IP is less important than in other branches of 
mathematics. What’s more the IP formulation has the advantage that global opti- 
mal solutions are obtained in contrast to local optimal solutions. This distinction is 
illustrated in Fig. 2.14. 



y 

A global optimum 




Fig. 2.14 Local and global optima 



If our objective were to maximise y then conventional optimising methods might 
find one of the local optima shown and not be able to improve on these. However, an 
IP formulation can be guaranteed to find a global optimum (at, possibly, consider- 
able computational cost). The next example illustrates how non-linear models may 
be formulated as linear IP models. 

Example 2.17 Make a piecewise linear approximation to the non-linear function 
y — — Ylx^ -h 23x -t- 86 for 0 ^ L ^ 39 and model the (approximate) relationship 

between x and y using an IP model. 

Such an expression (together with other non-linear expressions) may occur in the 
objective, the constraints or both. 

Figure 2.15 gives the graph oiy — x^ — YLx^ -h 23x -f 86. 

We evaluate the function at a number of grid points. These need not necessarily 
be evenly spaced, but for this example we evaluate the function at integers between 
X — Q and 9 giving Table 2.1. 




56 



2 Integer Programming 




Fig. 2.15 A piecewise linear approximation to a non-linear function 



Table 2.1 Values for a Non-linear Function 

X0123456789 
y 86 98 92 74 50 26 8 2 14 50 



Joining these points in Fig. 2.15 gives the piecewise linear approximation 
ABCDEFGHIJ to the curve for the function. In order to do this we introduce 
(non-negative continuous) variables Xq, ki, k 2 , >- 4 , kj, Xg which 

represent ‘weights’ given to the points A,B,C,D,E,F,G,H,I,J in representing the 
function by the following equations: 



X — OA-o -f lA-i -j- 2A-2 “1” 3A-3 -t- 4A.4 -F 5A.5 -F 6A.6 -F Ikj -F 8Ag -F 9kg (2. Ill) 

y = 86 Aq + 98A-1 + 92A.2 -F 74A.3 + 50A.4 + 26A.5 -F 8X5 -F 2A7 -F 14A.g + 50A.9 

( 2 . 112 ) 



A.Q, A.1, A.2, A.3, A-4, A.5, A.6, A.7, A,g, A.9 ^ 0 
A.Q, A-i, A-2, A.3, A-4, A.5, k(f, A.7, A.g, A.96M 



(2.113) 

(2.114) 
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In addition we need to impose the condition 



At most 2 adjacent 1, can be non-zero 



(2.115) 



If 1, and A,+i are the two adjacent non-zero A, ’s then (x, y) represents a point on 
the line between the points corresponding to the ith and (i + l)th grid points. For 
example if A .0 = 0, A.i = 0, X 2 — A .3 = A. 4 , . . . , Ag = 0then(x, y) = (2|, 78|) 

being the point P in Fig. 2.15. On the true curve if x = 2^ then y — 79^ being 
the point Q. The piecewise linear approximation has therefore underestimated the 
true value of y by Obviously the more rehned the grid the more accurate the 
approximation, but more A, variables will be needed. 

It remains to represent condition (2. 1 15) by constraints. In order to do this integer 
variables are needed. We need to model the logical condition that the possible non- 
zero pair is either Aq and Ai, or Ai and A2, or A3 and A4, etc. This can be done by 
0-1 variables S 2 > • ■ • > ^9 and the constraints 



Aq < Si 


(2.116) 


^ Si -h S2 


(2.117) 


■^2 ^ S2 4 - S3 


(2.118) 


kg ^ Sg -h S9 


(2.119) 


Ag ^ Sg 


( 2 . 120 ) 


1 + S5 -|- Se -}- S7 -}- Sg -|- Sg = 1 


( 2 . 121 ) 



Constraint (2.121) ensures that exactly one S, is 1, the others being 0. Then by 
constraints (2.116)-(2.120) at most two adjacent A, can be non-zero. 

In order to ‘linearise’ expressions such as that above in this way they must 
be functions of a single variable or be able to be expressed as the sum of 
non-linear functions of single variables. Such functions are known as ‘sepa- 
rable.’ The limitation of only being able to model separable functions is not 
as restrictive as might, at hrst, be thought. For most functions arising in prac- 
tice it is usually possible to ‘separate’ them by suitable transformations. For 
example the product xy of two continuous variables, although not separable, 
can be transformed into a separable model by the log transformation since 
logxy = logx -l-logy. 

We again discuss how non-linear problems give rise to IP models and the impor- 
tant distinction between convex and non-convex models in Chapter 3. 
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2.3.8 The Knapsack Problem 

This is one of the simplest types of pure IP model involving only one constraint and 
takes the form 

Maximise 

( 2 . 122 ) 

j 

subject to 

j 

Xj ^ 0 Vy 
xje'L V j 

Like some of the other IP models discussed in Sect. 2.4, under the title of ‘com- 
putational complexity’, this type of model is ‘difficult’ to solve but, in one sense, 
the easiest of such IP models. Extensions of the knapsack problem are the bounded 
variable knapsack problem, the 0-1 knapsack problem and the equality-constrained 
knapsack problem. The name arises from the largely fictitious application of trying 
to fill a knapsack of limited size with items of maximal total value (although it has 
been used to allocate the hardware for different experiments in space exploration 
vehicles). More commonly it is used as a subproblem for other combinatorial prob- 
lems (e.g. cutting stock and bandwidth allocation). 

The LP relaxation of the knapsack problem is trivial to solve (concentrate on 
the activity with the greatest ratio of objective coefficient/constraint coefficient) 
but the IP involves finding the best combination of items to fill the knapsack 
taking account of both this ratio and how well it fits (with other items) into the 
knapsack. 



(2.123) 

(2.124) 



2.3.9 The Travelling Salesman Problem 

We demonstrate this problem by the following example. 

Example 2.18 A routing problem. Find the shortest order to go around cities N — 
{1, 2, . . . , «}, returning to the beginning. The distance from city i j is given 
as dj. 

We will assume that the distance from i to j is not, necessarily, the same as the 
distance from j to i, i.e. we have an example of the asymmetric travelling salesman 
problem (TSP). 

We can model this problem as 
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Minimise 



'^CijXij (2.125) 

ij 

subject to 

= l WieN (2.126) 

J 

^Xy = l VjeN (2.127) 

i 

Xij < |5| - 1 V5 C such that 1 < |5| < n (2.128) 

iJeS 

Xij€{0, 1} V/, jeN 

The 0-1 variables Xij indicate whether a direct link i —>■ j is, or is not, on the 
tour. Constraint (2. 126) forces each city to be left exactly once and constraint (2. 127) 
forces each city to be entered exactly once. Constraint (2.128) rules out subtours. 
Figure 2.16 demonstrates subtours which would be ruled out by constraints (2.128). 
Note that there are 2" — n — 1 subtour elimination constraints, i.e. an exponential 
number as a function of m. In practice these constraints are only added on an ‘as 
needed basis’, or known facet constraints are added when they cut off fractional 
solutions to the LP relaxation. 




There are other formulations of the TSP, as an IP, with a polynomial number of 
constraints although they have weaker LP relaxations (with one exception). Refer- 
ences are given in Sect. 2.5. 
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2.3.10 Other Problems 

In addition to the applications of 0-1 IP discussed above there are a number of other 
applications of 0-1 IP not discussed here but which can be followed up through the 
references. 

Other problems involve networks and graphs (with nodes and arcs) and are nat- 
urally modelled with 0-1 variables representing the presence of particular nodes 
and the resulting presence of arcs emanating from them. The quadratic assignment 
problem is an extension of the assignment problem mentioned above (which can be 
treated as an LP) but is naturally modelled with products of 0-1 variables which can 
be linearised (with a large increase in model size) in the way shown in Example 2. 15 
making it a genuine IP. The matching problem that was mentioned above can also be 
regarded as an extension of the assignment problem. The matching problem, itself, 
has a number of extensions. 

Scheduling problems involve deciding in what order to process jobs on machines. 
The 0-1 variables can be used to decide, for each pair of competing jobs, which one 
is processed first. 

Capital investment problems (in addition to facilities location, already discussed) 
can incorporate capital limitation constraints and be modelled over periods of time 
with resultant cash flows. 



2.4 Computational Complexity 
2.4.1 Problem Classes and Instances 

It is clear that problems in some classes are more difficult to solve than others. For 
example LPs are relatively ‘easy’ to solve. Models with thousands of constraints 
and variables are now solved routinely on desktop computers. Models with millions 
of constraints and variables have also been successfully solved. For IP the picture 
is very different. While some large models have been successfully solved some 
models, with only hundreds of variables, can prove very difficult. In logic large 
satisfiability problems (which are discussed in Chapter 4) can prove very difficult to 
solve. The subject of computational complexity is concerned with trying to classify 
problems into the easy and difficult categories. According to the method which we 
will describe LP is easy but IP, and many of its special cases, and the satisfiability 
problem are difficult. 

The method which we describe looks at classes of problems rather than instances. 
Within a particular ‘difficult’ class there may be easy instances. The categorisation 
of classes will be based on the worst cases within that class. In some ways this is 
unsatisfactory since in practice the worst cases may seldom arise. However if we 
base the categorisation on worst cases then we can give a ‘guarantee’ of difficulty. 
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Average case analysis is much more difficult and requires probability distributions 
to be defined in order to define what the average cases are. 

It is important to recognise that the method uses categorises problem classes 
rather than algorithms. There may be ‘good’ and ‘bad’ algorithms for solving prob- 
lems in any class. For a problem class to be designated easy it is necessary to show 
that there is at least one easy algorithm for solving all problems in the class. A 
problem class will be designated ‘easy’ if there is an algorithm for solving problems 
in it whose number of steps will never be more than a polynomial function of the 
size of the problem (storage spaces required in a computer). On the other hand a 
problem class will be designated ‘difficult’ if there is no known such algorithm and 
the number of steps in all the algorithms known requires an exponential number of 
steps. These definitions will be made more precise later. A rather negative aspect 
of the categorisation, in terms of mirroring reality, is that some successful algo- 
rithms for problems in the easy class are exponential (in the worst case). The classic 
example is the simplex algorithm for LP. Examples can be found (see references) 
where the simplex algorithm takes an exponential number of steps although in prac- 
tice (‘average cases’) it is known to solve huge models in reasonable periods of 
time. However there are known to be polynomial algorithms for LP (one of which 
sometimes overtakes the Simplex on ‘huge’ problems). For IP and the satisfiability 
problem no algorithms have ever been found which are not exponential or worse. 



2.4.2 Computer Architectures and Data Structures 

Another aim of this problem characterisation is to make it, as far as possible, inde- 
pendent of the computer architecture used. In order to do this fairly crude assump- 
tions are made but it turns out that they often do not matter in making the (major) 
distinction between polynomial and exponential algorithms. One of the assump- 
tions is that ‘elementary’ operations such as addition, multiplication, comparison, 
etc. all take the same amount of time on ‘small’ numbers. Of course multiplication 
is inherently more difficult than addition, although by how much depends on the 
architecture of the computer used. 

Also of great significance, in practice, is the data structure used for encoding 
instances of the problem. We must, however, restrict ourselves to ‘sensible’ data 
structures. We would not encode a number « by a sequence of n I’s. We would, 
rather, encode it using binary notation taking account of the significance of the 
position of each digit as well as its value. For example 23 would be written as 
10111 (since 23 = lx2‘^-|-0x2^-|-lx2^-l-lx2'-|-lx 2°) instead of as a 
string of 23 I’s. It is useful (and realistic) to use only two symbols 0 and 1 given 
the binary hardware devices used in computers, although representing numbers to a 
higher radix would not alter the final analysis. However, what is important is to use 
some such representation. Clearly the storage required for a number n will be log 2 n 
bits (with possibly another bit to represent the sign of n), as opposed to n bits. 
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2.4.3 Polynomial and Exponential Algorithms 

If the storage required for a problem instance is N bits then we will represent the 
number of elementary operations needed to solve it by f(N). If f(N) ^ cN’’ for 
some algorithm and for all instances in the class, for a particular c and a particular r 
then the algorithm is said to polynomially bounded and the class of problems is said 
to belong to category P (polynomially solvable). We seek the algorithm with small- 
est value of r for which this inequality holds. This definition has certain limitations. 
First we do not normally take account of the magnitude of c. This means that, in 
practice, a very large number of operations might be needed but the growth in the 
number of operations needed, as problem sizes increase, will be less than for non- 
polynomially bounded classes. This measure is an asymptotic one, measuring rate 
of computational growth rather than computational difficulty for particular problem 
instances. One of the reasons for paying little attention to the value of c (and, to a 
lesser extent, r) is that these values can be altered by different implementations of 
an algorithm and the data structures used. 

If the rate of growth of an algorithm cannot be shown to be polynomially bounded 
then we could look at gradually higher rates of growth, e.g. , ^At'°g^. 

However, we do not usually examine these possible bounds on growth but look at 
exponential growth rates. If no algorithm, for a particular problem class, can be 
found which is polynomially bounded we see if we can find a bound of the form 
f{N) ^ ca^ . The number of computational steps is then said to be exponentially 
bounded. This does not mean that there does not exist a polynomially bounded 
algorithm for problems in the class but, rather, that one has not been found. There 
is, however, strong experimental and theoretical evidence that for many of the ‘dif- 
ficult’ problems (e.g. IP and many special cases) we consider that no polynomially 
bounded algorithm can exist. Before we discuss why this is so it is worth illustrating 
the dramatic difference between polynomial and exponential growth by comparing 
the growth in operations with that of 2'' operations. This is done in Table 2.2 
where we assume each elementary computer operation takes 1 ms. Then the com- 
putational times are those given. 



Table 2.2 Exponential Versus Polynomial Growth 



r 


t-2 


T 


1 


0.000001 s 


0.000002 s 


10 


0.0001 s 


0.001 s 


20 


0.0004 s 


1 s 


30 


0.0009 s 


17.9 min 


40 


0.0016 s 


12.7 days 


50 


0.0025 s 


35.7 years 


60 


0.0036 s 


366 centuries 



It is worth noting, at this point, why the encoding of the problem data was impor- 
tant. Suppose we had not used a binary encoding (or an encoding to some other 
radix), but simply a string of digits of length equal to the size of the data. Instead of 
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being of size N the data would have been of size M — 2^ (since log2(2^) = N). 
Then an exponential function of N, would become i.e. a polynomial 

function of M and the distinction between exponential and polynomial growth 
would be lost. 

If, however, we were to encode to a different radix than 2 then this would simply 
alter the value of c for a polynomial growth rate and the value of a for an exponential 
growth rate. It would not, therefore, alter the distinction between polynomial and 
exponential growth rates. 

We should also explain that, in this discussion, we are restricting ourselves to data 
and solutions that are integer or rational. Therefore these can be represented exactly 
in a computer and we do not have to bring in concepts of solution to some degree of 
approximation. Of course it is possible to represent any ‘computable’ number (i.e. 
any we might wish to talk about such as, for example, \/2 or tt) in a computer by 
an exact ‘formula’ (computer procedure) but such issues of a uniform method of 
representability are beyond the scope of this book. We restrict ourselves to the usual 
(binary) encoding of numbers to successive multiples of powers to a given radix. If 
necessary rational numbers could be represented exactly by storing their numerator 
and denominator (reduced to be coprime), it is easy to show that the size of the 
denominator is bounded by the maximum determinant of a basis matrix in LP. 

It is necessary to distinguish between ‘small’ and ‘large’ numbers in discussing 
the encoding. While each ‘number’ in some problems (e.g. the satisfiability problem 
and some graph problems) can be represented as a small number and elementary 
operations on such numbers assumed to take unit time, this is unrealistic for other 
problems. For example LP and IP problems may contain arbitrarily large coefficients 
and the time to carry out operations on them will be related to (the logarithm of) their 
size. Therefore, for example, we could take the size of the encoding of an LP or IP 
to be bounded by (m + I) x (« + I) x logmax,y(| |, Ic^- 1, \bi |), where m, n, A, c, b 
represent the number of constraints, variables, matrix of coefficients and objective 
and right-hand-side vectors, respectively. Other ways of measuring the size of the 
encodings (e.g. log(X,y(A,y -|- Cj -|- fe, )) will make no difference to whether they be 
classified as polynomial or exponential. With any of these encodings LP becomes 
polynomially bounded (by the Ellipsoid and Karmarkar algorithms referenced in 
Sect. 2.5). However no polynomial algorithm has ever been found (or seems likely 
to be found) which is a function of m and n alone, i.e. which ignores the magnitude 
of the coefficients. In contrast there are polynomial algorithms for solving simul- 
taneous equations (e.g. by Gaussian elimination) which are functions of m and n 
alone. Another reason suggesting why it is usually necessary to take account of 
the magnitude of the coefficients is that it is, in principle, possible to reduce many 
problems to ‘simpler’ problems with larger coefficients. Any pure IP model, with all 
equality constraints, can be converted to a knapsack model by aggregating the con- 
straints (but producing very large coefficients in consequence). In one sense the 
knapsack problem is ‘easier’ than general pure IP (although both are difficult in 
the classification given here). It would therefore be unrealistic to assume the size of 
the coefficients in a knapsack problem do not matter. The celebrated work of Godel, 
mentioned in Chapter 1 , was carried out (again in principle) by encoding proofs of 
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theorems uniquely as numbers and then defining relations between these numbers. 
If carried out in practice these ‘Godel numbers’ would be astronomic. But they do 
demonstrate that any ‘problem’ can be represented by a number and the problem 
of solving it reduced to a computation on that number. The size of this number (a 
‘super-exponential’ function of the numbers in the original problem) must, obvi- 
ously, be realistically taken into account when discussing the complexity of solving 
the original problem. (What Godel did was to create a number ‘G’ (in principle) 
which represented the formalised statement (in the meta system) ‘The statement 
with Godel number G is not provable’. Hence if G represented a ‘true’ statement (in 
the formalisation of the meta system) then it could not be proved within the system 
and vice versa. This also demonstrates that there are problems for which there is no 
bound on the number of computational steps needed. They are ‘unsolvable’. 



2.4.4 N on-deterministic Algorithms and Polynomial Reducibility 

In order to analyse the computational complexity of the difficult problems which 
we consider in this book we note that the (optimal) solutions to many of them 
could be checked in polynomial time if we had a way of checking all the candidate 
solutions in parallel. This is a theoretical, but practically unrealisable notion. We 
call a ‘machine’ which can carry out such a check of all possible cases in paral- 
lel a non-deterministic computer. If each check is polynomially bounded then the 
algorithm is said to be non-deterministically polynomially bounded. If there exists 
such an algorithm for a class of problems the class is said to belong to the category 
NP. It turns out that most of the problems which we consider in this book belong 
to NP. Of course problems that are in P are also in NP since a problem that can 
solved in polynomial time can trivially have the solution checked in polynomial 
time (without even the need for a non-deterministic machine). However, to show 
that a problem class is not polynomially bounded we need to show that it is in NP 
but not in P. It has never been proved that there are such classes, i.e. that P ^ NP. 
However, there is strong evidence to suggest it is true. We therefore seek to find 
the ‘most difficult’ problem classes in NP. The problems here are referred to as 
‘NP complete’ and the category as NPC. It turns out that all the known ‘difficult’ 
problem classes, which we are considering here, e.g. IP and various applications 
such as the TSP as well as the satisfiability problem, are ’polynomially reducible’ 
to each other. This means that if we take a problem Pi in a ‘difficult’ category Ci, 
which is encoded by a string A^i , then we can transform it in a number of steps g{N\), 
which is polynomially bounded, into a (polynomial) encoding N2 of a problem P2 
in a class C2. If there is no polynomially bounded algorithm for the problems in 
Cl then there can be no polynomially bounded algorithm for the problems in C2. 
Otherwise we could use such an algorithm to construct a polynomially bounded 
algorithm for class Ci. In Chapter 4 we discuss the satisfiability problem and show 
that it can naturally be converted into a set-covering problem (a special case of IP). 
The transformation is immediate and, therefore trivially polynomial. Therefore if 
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there is no polynomially bounded algorithm for the satisfiability problem (and none 
has ever been found) there can be none for the set-covering problem or general IP. 
It should be pointed out that such polynomial reductions can be quite ‘radical’ as 
their only real purpose is theoretical. The reduced problems may bear little practical 
resemblance to the problems from which they are reduced. Historically the satisfia- 
bility problem was taken as class Ci. No polynomially bounded algorithm has ever 
been found for all problems in this class (although there are polynomially bounded 
algorithms for specialised cases). It can be transformed into other difficult problem 
classes and vice versa. Once we have a polynomial transformations between this 
and other problem classes all these classes can be put in NP and any transformation 
between another problem class and any class in NP places the new class in NP. It 
might be thought that the reformulation of IP models, in terms of their convex hull, 
makes them polynomially solvable, as they become LPs. But, in general this will 
result in an exponential number of constraints (as a function of the problem size). 
Also the amount of calculation needed to produce facet-defining constraints may 
not be a polynomial function of the problem size. 

There are also ‘more difficult’ problems than those in NP, i.e. problems where 
there is no known non-deterministic polynomial-checking algorithm. Such prob- 
lems have been named NP hard, i.e. they belong to the NPC class or are ‘harder’ 
than any problems in this class. While NPC problems may be polynomially 
reducible to such problems the reverse will not be the case. It is rather unfortunate 
that NP complete problems are sometimes referred to as ‘intractable’. This is falsely 
taken, by some people, to imply that there is no point in seeking exact algorithms 
for them. Instead (polynomially bounded) heuristics must be sought to produce 
approximate solutions (possibly within some error bound). This is false. There are 
many problems belonging to ‘difficult’ classes which are reasonably easily solved 
for most practical instances. The existence of (sometimes pathological) worst cases 
which may take a long time to solve should not deter one from seeking exact 
solutions in practice. 



2.4.5 Feasibility Versus Optimisation Problems 

Some of the problems that we have discussed are optimisation problems (e.g. LP, IP 
and the many special cases) and some are feasibility problems (e.g. the satisfiabil- 
ity problem and certain decidability problems). It is usual, when doing complexity 
classifications, to regard all problems as feasibility problems. Suppose, for exam- 
ple, we have an IP model, which is a maximisation, and we know lower (/) and 
upper (m) bounds on the optimal objective value. We can test if zo is the optimal 
objective value by checking if the constraint object i ve ^ zo, when appended to the 
constraints, keeps the model feasible but the constraint objective ^ Zo + 1 makes 
it infeasible. The value of zo can be found, in a polynomial number of steps, by 
dissecting u — I to test zi = (m — l)/2 (or nearest integer) for feasibility. If feasible 
the new interval for dissection is zito u. If infeasible the new interval is / to zi- 
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Proceeding in this way the optimal solution can be found with log 2 (w — /) feasibility 
tests. Hence polynomial tests for optimality imply polynomial tests for feasibility 
and vice versa. 



2.4.6 Other Complexity Concepts 

Our categorisation has been based on the time to carry out computer operations. 
There is also interest in the space required, e.g. what is the maximum space require- 
ment as a function of the size of the input data? We will not cover such an analysis 
here. However, time and space complexity are connected since if we require an 
exponential number of storage positions we will require an exponential time to read 
them. However, the reverse need not be the case. We may require an exponential 
time but be able to carry out the calculation in a polynomial space. 

Another indicator of the ‘easiness’ of some problems is an easy check for opti- 
mality. For example, for LP, most algorithms provide the optimal dual solution 
simultaneously with the primal solution. Primal and dual feasibility (or, alterna- 
tively, identical objective values) prove optimality. For IP the analogous method 
would be to use the optimal Chvatal function, as discussed in Sect. 2.2. But this 
would be, at least, exponentially difficult to calculate. 

As mentioned above there are truly ‘intractable’ problems, i.e. those which are 
unsolvable as discussed in Chapter 1, e.g. undecidable problem classes such as 
establishing the truth or otherwise of all statements in formalised arithmetic. How- 
ever, some ‘less ambitious’ problem classes are also unsolvable, e.g. general IP with 
quadratic constraints and objective. 

A number of other observations about the complexity of the problems discussed 
in this book are in order. It has already been remarked in Sect. 2 that ‘easy’ problems 
(e.g. LP and matching) appear to have a low Chvatal rank (0 or 1) and ‘difficult’ 
problems (IP, set covering, TSP, satishability) unbounded Chvatal rank. Also poly- 
nomially bounded classes usually require only a low-degree polynomial algorithm. 
For LP Karmarkar’s algorithm is of degree 7 in the data m,n, A. Matching is of 
degree 3 in n (the number of nodes). This may be simply a reflection of the com- 
prehensibility of the known algorithms rather than their potential existence. But it 
does suggest the possibility of high-order polynomially bounded algorithms for the 
‘difficult’ (thought to be NP complete) problems. If this were the case the ediface 
would collapse and we have P = NP. 



2.5 References and Further Work 

The original text on LP is Dantzig [30]. It is still a major valuable text. Chvatal [24] 
and Martin [79] are also excellent texts together with many more. The Fourier- 
Motzkin method for LP is due to Fourier [38] and is discussed in Williams [114] 
and Martin [79]. The original statement of the simplex ‘algorithm’ cannot be guar- 
anteed to terminate. The smallest possible example of a non-terminating instance 
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is given by Hall and McKinnon [51]. There are a number of ways of modifying 
the simplex algorithm to guarantee termination. The simplest such method is due 
to Bland [15]. Even when termination is guaranteed the simplex algorithm may 
take an exponential number of steps as is shown by examples due to Klee and 
Minty [69] and Jeroslow [61]. It is still possible that the simplex algorithm could 
be polynomially bounded by a new choice of ‘pivoting rule’ . Average case analysis 
of the simplex algorithm has been done by Smale [102]. The largest LP reported 
(arising in financial planning) and solved has 353 million constraints and 1010 
million variables and is due to Gondzio and Grothey [46] . 

Nemhauser and Wolsey [86] is the standard text on IP. The extension of Fourier- 
Motzkin elimination to IP is described by Williams [111]. Schrijver [98] is a major 
theoretical text on LP and IP. Williams [110, 113, 119] gives formulations of many 
types of LP and IP models. Appa et al. [3] edit a survey of applications of IP 
ranging from vehicle routing, the design of telecom networks, supply chain design 
and operation, pharmaceutical manufacture, oilfield infrastructure design and plan- 
ning, radiation treatment for cancer, multitarget tracking and molecular biology. The 
application of the branch-and-bound algorithm to IP is due to Land and Doig [71] 
and Dakin [29]. The formalisation and naming of the concepts of relaxation and 
fathoming is due to Geoffrion and Marsten [43]. It is possible to ensure that the 
branch-and-bound algorithm terminates by placing bounds on the values of the inte- 
ger variables derived from vertex or extreme ray solutions of the LP relaxation (so 
long as the data are rational). 

The concept of a Chvatal function and the fact that it closes the duality gap for a 
pure IP is due to Chvatal [23]. 

Gomory [45] gives the first method of solving PIP models by adding cuts result- 
ing from solutions by the simplex algorithm. Marchand and Wolsey [76] discuss 
the generation of cuts for MIP models (which we do not discuss here). Tind and 
Wolsey [104] give a survey of IP duality. Balas [8] also discusses it. Williams [118] 
surveys the concept of duality for mathematics in general. 

The set covering, packing and partitioning problems have been extensively stud- 
ied by, for example, Balas and Ng [10], Cornuejols and Sassano [27] and Balas and 
Padberg [11]. Edmonds and Johnson [35] describe the matching problem, its exten- 
sions and complexity. It is also discussed by Pulleyblank [92]. Ryan [97] applies set- 
partitioning models to very large aircrew scheduling problems. The practical solving 
of non-linear models by IP is discussed by, among others, Beale and Tomlin [13]. 
The knapsack problem is surveyed in Martello and Toth [77]. Facets of the 0-1 
knapsack problem are given by Balas [6], Wolsey [125] and Hammer et al. [52]. The 
TSP has been extensively studied. A standard edited volume is Lawler et al. [74]. 
Orman and Williams [87] give and compare eight different formulations of the TSP 
as IPs. Applegate et al. [4] give a comprehensive computational study of the TSP. 

Computational complexity is discussed in Aho et al. [1]. 

The concept of NP completeness is due to Cook [26] and Karp [67]. Garey 
and Johnson [41] is the standard text on the subject. Polynomial algorithms for 
LP are given by Khachian [68] and Karmarkar [66]. Depending on the exact 
implementation, Khachian’ s ellipsoid algorithm is bounded by a polynomial 
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of degree (m* log 2 max(A,y, c;, fo,)) and Karmarkar’s algorithm is of degree 
(n^ log 2 max(A ,2 , cj, bi)). Jeroslow [60] shows that IP with quadratic constraints is 
unsolvahle. Fourier-Motzkin elimination is of complexity . The aggregation of 
pure IP models with equality constraints into knapsack models is due to Bradley 
[20]. A popular discussion of the creation of Godel numbers is contained in Nagel 
and Newman [85]. 



2.6 Exercises 



2.6.1 An LP can be regarded as an inference problem of finding the strongest 
inequality on the objective which is implied by the constraints. Express this problem 
in the predicate calculus applying it to Example 2. 1 . 

2.6.2 Convert the predicate calculus representation in Exercise 2.6.1 to that of 
Example 2.3 using the rules of the predicate calculus. 

2.6.3 Convert the dual model (2.23)-(2.27) to a maximisation subject to con- 
straints. Create the dual of this model and show it is the same as that in Example 2. 1 . 



2.6.4 Represent the dual model (2.23)-(2.27) geometrically. 

2.6.5 Apply the method of Example 2.3 to ‘solve’ Example 2.5 with objective 
(2.36). 

2.6.6 Create the dual of Example 2.6 and represent it geometrically to show that is 
unbounded. 

2.6.7 Show that 
Maximise 



subject to 



Xl X2 



Xl — X2 ^ — 1 

—Xl -I- X2 < -1 

Xi,X2P 0 



Xl, JC26 R 



and its dual are both infeasible. 

2.6.8 Develop nodes 2 and 7 of Example 2.8 and show that they do not produce 
integer solutions. 

2.6.9 Solve Example 2.8 by branch-and-bound, branching on X 2 at the top node. 
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2.6.10 Show that the following class of IP models will take an exponential (as a 
function of n) number of nodes to ‘solve’ 

Maximise 



subject to 



Xi + X2 H h x„ 



X\,X2,--- ,X„ ^0 

2.x \ -f 2x2 * * * "h 2x^ = 3 

Xl, X2, • • • , Xn ^0 

xi, X 2 , . . . , x„eZ 



2.6.11 Use the method illustrated in Example 2.7 to demonstrate that Example 2.9 
has no feasible (integer) solution. 

2.6.12 Show by branch-and-bound that statement (1.83) is false. 

2.6.13 Answer the following questions. If an answer is negative give a counter 
example. 

i. If an IP is solvable (not infeasible or unbounded) is the corresponding LP relax- 
ation solvable? 

ii. If an IP is infeasible is the corresponding LP relaxation infeasible? 
hi. If an IP is unbounded is the corresponding LP relaxation unbounded? 
iv. If an LP relaxation of an IP is infeasible is the IP infeasible? 

V. If an LP relaxation of an IP is unbounded is the IP unbounded? 

2.6.14 Derive the optimal objective bound (i.e. optimal objective value) for the IP 
Example 2.7 as a combination of rounding and linear combinations of the original 
constraints. Give the Chvatal rank for the model. 

2.6.15 By means of the class of models 
Maximise 



subject to 






y — ax ^0 
y + ax ^ a 
y ^ 0,x, yeZ 



show that the class of two-variable pure IP models is of unbounded rank. 
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2 . 6.16 If a pure IP model with all equality constraints can be aggregated into an 
equality-constrained knapsack model why is the same not necessarily true for pure 
IP models with inequality constraints and a knapsack model with an inequality con- 
straint? 

2 . 6.17 Solve the model from Example 2.16 by branch-and-bound. 

2 . 6.18 Show, using the decision procedure described in Example 1.8, that the LP 
relaxation of Example 1.9 is true. 

2 . 6.19 Represent Example 1.8 geometrically. 




Chapter 3 

Modelling in Logic for Integer Programming 



3.1 Logic Connectives and IP Constraints 

We can regard an LP, or IP, constraint as an atomic proposition. 

LP is concerned with optimising over a conjunction of such constraints. If, however, 
we wish to allow a more general logical statement about such constraints we require 
an IP model. 

In order to systematise the modelling of such conditions it is convenient to intro- 
duce 0-1 integer variables to represent the satisfaction, or otherwise, of such con- 
straints. When this is done it is usually necessary to know the upper or lower bounds 
for the left-hand side of the constraint above, i.e. 

771 / ^ '^QijXj < Mi (3.1) 

j 



Such bounds can usually be derived from knowledge of the problem or known 
bounds on the variables. It is important for the bounds not to place unwarranted 
restrictions, but computationally desirable for them to be as ‘tight’ as possible. In 
fact the derivation of as small ‘big M’ (or large ‘little m') values as possible is 
of central importance, as this affects the strength of the LP relaxation, and conse- 
quently the amount of computation in the subsequent IP optimisation by, for exam- 
ple, branch-and-bound. This aspect is discussed in Sect. 3.3. If it is desired to get the 
tightest values of M and m possible then one could, of course, resort to the, possibly 
computationally costly, approach of maximising and minimising aijXj subject 
to the LP relaxations of the other constraints in the problem to obtain M and m, 
respectively. In the ‘convex’ method of formulation, described in Sect. 3.4, it is 
possible to dispense with M and m altogether resulting in the LP relaxation yielding 
integer solutions. 

If it is impossible to place either one or both bounds on the quantity then the 
condition may be MIP unrepresentable as discussed in Sect. 3.2. In order to describe 
the necessary modelling we use the following example. 

Example 3. 1 Represent the following conditions by (mixed) IP constraints involving 
the variables in the constraints and the 0-1 variables 3, : 

H.P. Williams, Logic and Integer Programming, International Series in Operations 7 1 

Research & Management Science 130, DOI 10.1007/978-0-387-92280-5-3, 

© Springer Science-rBusiness Media, LLC 2009 
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11 . 



111 . 



'^UijXj bi -> 3,- = 1 (3.2) 



= 1 



E' 



< bi 



(3.3) 



'^ttijXj < bi < — ^ &i = 1 (3.4) 

j 

For illustration we are only considering the inequality It is straightforward 
to extend to and ‘=’ by converting these (in) equalities to 

A convenient way of viewing condition (i) is by the contrapositive statement 
discussed in Sect. 1.3. The contrapositive of (3.2) is 



S; = 0 



E 






< bi 



which, in turn, can be written as 

Si = 0 ^ ^Oijxj > bi 

j 



(3.5) 



(3.6) 



We do not normally use the strict inequalities ‘<’ and ‘>’ in LP and IP. This 
is because we wish to find values for variables which give maxima and minima, 
which may lie on a boundary. But for strict inequalities we may only be able to 
give suprema (least upper bound) or infima (greatest lower bound). However, we 
can write, to a degree of approximation, (3.6) (using for convenience) as 



5 ,- =0 



'^aijXj < -bi - s 
j 



(3.7) 



where e is a suitable small number. If the left-hand side of the inequality is all integer 
s may be taken as 1 (and there be no approximation). 

We wish to impose the condition that if S, = 0 then — aijXj + bi + s ^0. 
The largest that the left-hand side of this expression can ever be is —m + b + s. 
Hence we may write 
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- ^ UijXj +bi + e ^ i-rrii + bi + e)&i (3.8) 

J 

Equation (3.8) is more conveniently written as 

aijXj - (nii - bi - s)8i ^ bi -\- e (3.9) 

j 

Therefore condition (i) above has been written as an IP constraint. 

Equation (3.3) can be written, using analogous reasoning to that for (i), as 

atjXj + {Mi - bi )&i < Mi (3.10) 

j 

This gives condition (ii) as an IP constraint. 

Finally condition (hi) is modelled by taking both constraints (3.9) and (3.10) 
together. 

If any of (3.2), (3.3) or (3.4) involve a conjunction of the original constraints then 
the single 0-1 variable 8i can be used to represent all of these constraints. 

It is sometimes unnecessary to model the full equivalence condition (hi) between 
the satisfaction of a constraint and the value of a 0-1 variable since one of the 
implications may be guaranteed by optimality. For example this happens with the 
facility location problem discussed in Sect. 2.3. A facility (e.g. warehouse) will not 
be opened if it is not used since this would be a non-optimal thing to do. Hence the 
condition = 0 ^ 3, = 0 need not be modelled explicitly. 

We are now in a position to model all the connectives of the propositional 
calculus, applied to FP or IP constraints, by first representing the constraints by 
0-1 ‘indicator’ variables and S 2 and modelling one of the conditions above. In 
many situations we may be able to suffice with modelling only condition (i) or 
condition (ii), but this will depend on the particular problem. 

Because of the results in Sect. 1 .3, we can suffice with modelling only a complete 
set of connectives. However, for convenience, we will model all the most commonly 
used connectives (Table 3.1). 



V 



Table 3.1 Connectives as 0-1 Constraints 



+ ^2 ^ 1 



-j- ^2 — 3 — 0 ^1 ^ ^2 *^1 — 82 



For the ‘v’ connective if we have a disjunction of just two constraints then we 
could suffice with one 0-1 variable S setting it to 0 to imply (or be implied by) 
one constraint and 1 for the other constraint. But we have used two 0-1 variables 
to demonstrate the generalisation to a disjunction of any number of constraints. 
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However if, in general, we have the disjunction of r constraints (or disjunction of 
r conjunctions of constraints) we could suffice with [log2 r] (where ‘ [] ’ represents 
the integer round-up operation), 0-1 variables since there will then be at least r 
settings of these variables. This is the subject of Exercise 3.7.1. 

Example 3.2 shows how we may use these modelling devices. 

Example 3.2 Model the following condition between LP constraints: 



/2xi 4- X2 < 6\ / xi -h 3x2 < 9 \ 

V -L2 < 4 J \ 7xi -t- X2 < 20 y 

Xi, X2 ^ 0 
xi, X2eM 



• 2xi + 2x2 ^ 9 



(3.11) 

(3.12) 



It can be seen from the constraints above that 0 ^ xi ^ 3, 0 ^ X2 ^ 4. Hence 
(numbering the five constraints in (3.11) as 1, 2, 3, 4, 5) we can give lower and 
upper bounds on their left-hand sides of mi — m 2 — m^ — mu — — Q, 

Ml — 10, M 2 = 4, M3 = 15, M4 = 25, M5 = 14. It can be verified that these 
bounds apply. Eurther analysis would allow us to tighten these bounds to com- 
putationally beneficial effect. However we use these, easily obtained, bounds for 
illustration. This allows us to link 0-1 variables S, to the constraints. We only need 
model condition (ii) for each of them since we only need to make an implication in 
one direction, for each constraint. This gives 



2xi X2 4(5 1 ^10 


(3.13) 


X2 < 4 


(3.14) 


xi + 3x2 + 6(52 ^ 15 


(3.15) 


7xi -|- X2 -|- 5(52 ^ 25 


(3.16) 


2xi 2x2 "b 563 ^ 14 


(3.17) 


(Note that it is unnecessary to use variable to imply the second constraint since 
it applies overall.) 

It can be verified that setting a particular S, to 1 forces the corresponding con- 
straints to hold. The logical relations between the constraints in (3.11) can then be 


written as 




[((5i = 1) V (82 = 1)] ■ (53 = 1 


(3.18) 



The logical condition can now be modelled by 



Si + 82 ^ 1 
82 = 1 



(3.19) 
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If we have an objective then we have a MIP model. We solve this below. 

Alternative and better ways of formulating such logical problems (including this 
example) are described in Sect. 3.3. 

3.2 Disjunctive Programmin g 

The logical connective that distinguishes IP from LP is ‘v’. It has been seen that 
simply transforms inequalities to other (strict) inequalities that are usually modelled 
by approximating a non-strict inequality. The ‘ ’ connective is implicitly present 
in all LP models. If one negates an ‘=’ constraint (possibly first writing it as a 
conjunction of a and a constraint) one also obtains a disjunction. A dis- 
junction is what requires the use of IP rather than LP. Hence the phrase ‘disjunctive 
programming’ is sometimes used to characterise this type of modelling. 

Although we will be showing, in this chapter, how to model disjunctive pro- 
grammes as IPs it would be possible to solve them (inefficiently) using the method 
described (for the theory of dense linear order) in Example 2.3. This is the subject 
of Exercises 3.7.9 and 3.7.10. 

3.2.1 A Geometrical Representation 

It is helpful to illustrate such models geometrically. We do this by representing 
Example 3.2 geometrically, in Fig. 3.1. 

The boundaries of all the inequalities in (3. 1 1) are marked and the feasible region, 
defined by the logical statement of inequalities, is OABCDEFGH. Note that this is 
really a Venn diagram (discussed in Sect. 1.3) as a result of the conjunctions and 
disjunctions. It is non-convex, unlike LP feasible regions that are always convex. A 
convex region is one in which any line between two points in the region always, 
itself, lies in the region. In Figs. 3.2 and 3.3 we illustrate the distinction. 

Non-convex regions arise in a number of contexts. In non-linear programming the 
feasible regions may be non-convex and IP used, possibly approximating the feasi- 
ble region by piecewise linear segments. IP has the advantage that it finds global, 
as opposed to local, optima as discussed in Sect. 2.3. If a region is non-convex, 
and approximated to in this way, then the convex region can be partitioned into the 
union of convex sets. Each convex set can be represented as the conjunction of LP 
constraints. Hence the overall region is a disjunction of these conjunctions giving 
us the constraints of a disjunctive programme. Therefore there is a close connection 
between IP and (non-convex) non-linear programming. 

There is also a distinction to be made between convex and non-convex objectives. 
If the region above each contour, representing equal values of the objective function, 
is a convex set then the objective function is said to be convex. Minimising a convex 
function over a convex region (or equivalently maximising a concave function over 
a convex region) guarantees a global optimum from the LP relaxation. We therefore 
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Fig. 3.3 A non-convex region 

do not need to stipulate, and model condition (2.115) given in Chapter 2, or use 
integer variables. Such non-linear convex models arise in a number of contexts, 
e.g. when we have decreasing returns to scale. 

In order to illustrate disjunctive programming further we solve Example 3.2 with 
an objective. 

Example 3.3 Given the constraints in Example 3.2 
Maximise 



3xi -f 2x2 (3.20) 

Eor convenience we restate the MIP constraints we created 

2x\ X 2 ^ 10 (3.21) 

X2^4 (3.22) 

xi + 3x2 + 632 ^ 15 (3.23) 

7xi X2 5^2 ^ 25 (3.24) 

2xi 2x2 563 ^ 14 (3.25) 

Si + 32 ^ 1 (3.26) 

33 = 1 (3.27) 

Solving the LP relaxation of this model gives 

XI = 3.086, X2 = 1.414, 3i= 0.603, 32 = 0.397, 33 = 1, 

Objective = 12.086 (3.28) 



Proceeding (e.g. by branch-and-bound) to the integer optimum gives 



Xi=2— , X 2 =l— , 3i = 0, 32=1, 33 = 1 , 

7 

Objective =11 — 

•’ 12 



(3.29) 
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It is instructive to view these solutions geometrically (in the space of xi and X 2 ) 
as is done in Fig. 3.4. 



= 1=2 




Fig. 3.4 A disjunctive programme 

The optimal integer solution corresponds to vertex F in Fig. 3.4. Note that if this 
had represented a non-linear problem vertex H would have represented a local opti- 
mum (with a lower objective value). Traditional non-linear programming algorithms 
could well produce this solution. If we use branch-and-bound on the IP formulation 
then vertex H will correspond to another integer solution which might well be pro- 
duced in the tree search before that at vertex F is found. 



3.2.2 Mixed IP Representability 

When is a problem representable as a MIP? We are going to restrict our attention 
to the possibility of modelling using bounded integer variables. In Chapter 2 it was 
shown that bounded integer variables can be expressed using only 0-1 integer vari- 
ables. Hence we consider when it is possible to model, using continuous and 0-1 
integer variables only. Such problems are sometimes referred to as bounded MIP 
representable but we will simply refer to these problems as MIP representable. We 
have seen that, in some circumstances, e.g. Example 3.2, it is possible to model a 
logical statement involving LP constraints as a conjunction of MIP constraints. But 
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we relied on some of the numerical bounds given in (3.1) in order to do this. If we 
know such bounds then it is possible to model the conditions in this way (possibly 
approximating strict inequalities if necessary). In some circumstances we do not 
know such bounds. Also the objective function may make the problem non-MIP 
representable. It may still be possible to model as a MIP but only if certain con- 
ditions apply, which we discuss here. Before doing this we give a simple logical 
condition which is not MIP representable. 

Example 3.4 A non-MIP-representable condition 



X = 0 V y = 0 



(3.30) 



X, y 6 M 



Since we do not have bounds on the values of x and y we cannot, obviously, 
model this disjunction by introducing 0-1 variables. Even if we place single bounds 
on each variable (e.g. lower bounds of 0) the condition is still not representable. 

Example 3.5 A MlP-representable disjunctive programme with an open feasible 
region 
Minimise 

x-l-3y (3.31) 

subject to 

/2x - y ^ -3\ 

I -X -h y ^ 2 1 V 
V X ^ 0 / 



X, y 6 K 



4x — 2y ^ 4 
-2x + 2y ^ —2 
X + y ^ 2 



V 



6x — 3y ^ 0 
— 3x -t- 3y ^ 1 
2x 4- 3y ^ 3 



(3.32) 



The feasible region is illustrated in Fig. 3.5. It can be seen that each clause in 
(3.32) gives rise to an open feasible region, as therefore does their union. However, 
Example 3.5 is MIP representable, unlike Example 3.4. 

A MIP model representing Example 3.5 is 
Minimise 



X 4- 3y 



(3.33) 
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/ 

A/ 

/ 




Fig. 3.5 Disjunctive constraints with an open feasible region 



subject to 



-X + y ^ -1 + 3Si 


(3.34) 


X ^ 0 


(3.35) 


4x — 2y ^ —6 + IOS 2 


(3.36) 


8x -j~ 8y ^ 9 “b 7 S 2 


(3.37) 


6x — 3y ^ —9 + 9 S 3 


(3.38) 


—3x + 3y ^ —3 + 4 S 3 


(3.39) 


2x + 3y ^ 3 


(3.40) 


i5i + S 2 -b S 3 ^ 1 


(3.41) 



Si,S2,53e{0, 1} 



Constraint (3.41) forces at least one of Si , S 2 , 53 to take the value 1 . It can be ver- 
ified that Si = 1 forces the constraints in the first clause of (3.32) to hold (but does 
not exclude the corresponding constraints holding in the other clauses). Similarly 
S 2 = 1 and S 3 = 1, respectively, force the second and third clauses to hold. 
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Why is Example 3.5 MIP representable while Example 3.4 is not? The reason 
is that the feasible region of Example 3.5 is the union of polyhedra with the same 
recession directions. A recession direction of a polyhedron (in this two-dimensional 
example) is a direction (p,q) such that for any point (a, b) in the polyhedron all 
points (a + Xp, b + Xq) for all A. ^ 0 are also in the polyhedron. Recession directions 
are drawn for a polyhedron in Fig. 3.6. 

B 




Fig. 3.6 Recession directions of an open polyhedron 

Possible recession directions are those between the extreme rays AB and CD. In 
Fig. 3.5 it can be seen that the three polyhedra, whose union makes up the feasible 
region, have exactly the same recession directions since they all have extreme rays 
pointing in the same directions. We show below why this property makes the cor- 
responding optimisation problems MIP representable. Before doing this, however, 
observe that Example 3.4 can be regarded as the union of two polyhedra x = 0 and 
y = 0 each of which have two, different, recession directions. Hence this example 
is not MIP representable. 

If all the polyhedra, whose union makes up the feasible region of a problem, are 
closed then each of the polyhedra has no recession directions and the conditions are 
satisfied vacuously making the problem MIP representable. This is the case with 
Example 3.3. 

Although all the above discussion has been illustrated by two-dimensional exam- 
ples the concepts and results extend to any number of dimensions. 

For completeness we solve the model resulting from Example 3.5. 

The LP relaxation solution is 



x = 0.719, y = 0.521, = 0.267, 32 = 0131, 33 = 0.602, 

Objective = 2.281 (3.42) 
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The integer optimum is 

2 11 3 

X = , ^1=0, 32 = 0, 33 = 1, Objective = 2 - (3.43) 

The integer solution (3.43) corresponds to the point F in Fig. 3.5. Clearly the 
third clause in the disjunction (3.32) is true. 

We now show that the same recession directions for a set of polyhedra guarantees 
their disjunction is MIP representable. 

If a set of polyhedra all have the same recession directions then an open facet 
of any of the polyhedra must have extreme rays which are contained in each of the 
other polyhedra. Consider an open facet from one of the polyhedra defined by the 
constraint 

'^aijXj'^bi (3.44) 

j 

Suppose we were to 

Minimise 

Y^aijXj (3.45) 

j 

subject to the constraints defining each of the other polyhedra in turn. If any of 
the resulting LPs were unbounded, then the directions along which (3.45) reduces 
indefinitely would define extreme rays of the other polyhedron. But such extreme 
rays must also be a recession directions of (3.45) contradicting the unboundedness. 
Flence we can determine finite values of m, . 

The optimal value of (3.45) is determined by its value at the vertex at the origin 
of this extreme ray (so long as the facet is ‘pointed’. If not we can choose a value 
at any point in the polyhedron). Hence we can obtain a (finite) lower bound, mi, on 
the value of , aijxj. Similarly we can find finite lower bounds m 2 , ..., m„, on 
the values of the left-hand sides of the other ‘open’ constraints. 

Note that if the number of distinct recession directions, on an open facet, is at 
least as great as the dimension of that facet then the corresponding facets must all 
be parallel and the m, can be taken as mint^, (i><:) (Exercises 3.7.2 and 3.7.3). This 
is the case for Example 3.5. 

We can force all of the constraints in each conjunctive clause to hold by setting 
the corresponding 0-1 variable 3, = 1 in MIP constraints 

'^aijXj - (bi - mi)&i ^ Mi Vi 
J 



(3.46) 
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together with 



(3.47) 

i 

i.e. the same 0-1 variable is used for each constraint (open and closed) in a conjunc- 
tive clause. 

Note that we can, in effect, confine ourselves to the same m, (or M, for 
constraints) which we would have used in the ‘closed parts’ of the corresponding 
polytopes. 

For constraints that do not contain extreme rays of the polyhedra we can obtain 
bounds on the values of the variables in them and formulate the corresponding MIP 
constraints as described in Sect. 3.1. 

Conversely, if we have a MIP representation of a model, then a typical constraint 
will be 



Y^ajXj-Y^bA^b (3.48) 

jeJ iel 



Xy-eM, 5,e{0, 1} VjeJ,ieI 



Setting the S, to different values gives a series of parallel constraints. 

Some, or all, of these constraints may represent closed hyperplanes. Some may 
be redundant in the presence of other constraints. However, if any represent open 
facets, of some of the component polyhedra in the implied disjunction, they must 
all represent open facets. What’s more they will be bounded by some extreme rays 
which must be parallel to the corresponding extreme rays in the other polyhedra 
in the implied disjunctions. This argument applies to all the constraints in (3.48). 
Hence the constraints in (3.48) represent a disjunction of polyhedra that are either 
all closed or, if not, have the same recession directions. 

It remains to consider when the objective function, taken in conjunction with the 
constraints, still makes a problem MIP representable. If the objective is linear and 
expressed in the original variables (not involving the appended 0-1 variables) then 
there is no problem. This is the case with Examples 3.3 and 3.5. However, it may 
be the case that we wish to model a piecewise linear or a discontinuous objective. 
For example the non-linear expression given in Example 2.17 might have occurred 
in the objective function. It would then be replaced by a variable, with the necessary 
extra variables and constraints added to the model. The representability problem 
would then be the same as that for a logical system of constraints. If the objective is 
discontinuous then the situation may be more complex. This happens, for example, 
in the facility location problem discussed in Example 2. 12. This problem is a special 
case of the fixed charge problem. We consider this problem in its simplest form in 
Example 3.6. 




84 



3 Modelling in Logic for Integer Programming 



Example 3.6 Formulate the following fixed charge problem as a MIR 

If an activity is carried out at a level x > 0 then it incurs a fixed cost /, irrespec- 
tive of the level. However, if x = 0 then there is no fixed cost (and no revenue). To 
be realistic we might give the activity x a (positive) revenue of p. 

Since we normally wish to minimise total cost (explicitly, or implicitly by 
maximising profit) we minimise z such that 

z ^ 0 if X = 0 
f — px ifx>0 
z, xe K 

Equations (3.49) and (3.50) define the polyhedra, in Fig. 3.7, consisting of the 
positive z-axis and the region above the line AC. The second polyhedron clearly 
has recession directions different to that of the first polyhedron (which has a single 
recession direction). Therefore this polyhedron, and therefore the unbounded fixed 
charge problem, is not MIP representable. However, if we place an upper (M) bound 
on the value of x then the problem is MIP representable as the second polyhedron 
has the same, single, recession direction as the first. The union of these two polyhe- 
dra is known as the ‘epigraph’ of the cost function. So long as the epigraph is MIP 
representable (and the feasible region is MIP representable) then the minimisation 
problem is MIP representable. This gives rise to the model 

Maximise 



(3.49) 

(3.50) 



px - fS (3.51) 

subject to 

X ^ MS 
X ^ 0 

xeK, (5e{0, 1} 

Note that for the facility location model, in Example 2.12, both the customer 
demands and the warehouse capacities place upper bounds on the x variables, mak- 
ing the problem MIP representable. 



(3.52) 

(3.53) 



3.3 Alternative Representations and Tightness of Constraints 

Alternative representations of statements in the propositional calculus were dis- 
cussed, at length, in Chapter 1. These different representations lead to different IP 
formulations. Example 2.10 demonstrates this. 
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Fig. 3.7 Cost function for the fixed charge problem 



Equation (2.80) models the logical condition 



(3i = 1 V 32 = 1) — ^ S 3 = 0 



(3.54) 



This statement is logically equivalent to 



(3; = 1 33 = 0) ■ (32 = 1 



33 = 0 ) 



(3.55) 



(Exercise 3.7.4 is to verify this.) 



However (3.55) leads, naturally, to the conjunction of constraints (2.82) and 
(2.83). Eigures 2.12 and 2.13 demonstrate that this latter IP formulation represents 
the convex hull of feasible integer solutions, allowing one to use computationally 
easier LP. 

We now investigate the respective merits of IP formulations resulting from ONE 
and CNE representations of logical conditions. Eollowing Chapter 2 we refer to 
convex hull formulations as ‘sharp’ and formulations that are closer to the convex 
hull as ‘tighter’. 
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3.3.1 Disjunctive Versus Conjunctive Normal Form 

Example 3.7 Formulate Example 3.2 in DNF and create, and solve, the correspond- 
ing IP model. 

The logical representation is 



/ 2xi -f- X2 < 6 \ / xi -f 3x2 < 9 \ 

I X2 < 4 1 V I 7xi + X2 < 20 1 

y 2xi -t- 2x2 ^ 9 / y 2xi + 2x2 ^ 9 / 
Xi, X2 ^ 0 



(3.56) 

(3.57) 



xi, X 2 eM 



Using objective (3.20) and representing the two clauses in the disjunction by 0-1 
variables i5i and S 2 , respectively, give the model 
Maximise 

3xi + 2x2 (3.58) 

subject to 



2xi X 2 4(5 1 10 


(3.59) 


X2 < 4 


(3.60) 


2xi “h 2x2 "h 5^1 ^ 14 


(3.61) 


xi + 3x2 + 6(52 ^ 15 


(3.62) 


7xi -|- X 2 -|- 5(52 ^ 25 


(3.63) 


2xi 2x2 “h 5^2 ^ 14 


(3.64) 


+ 82 ^ 1 


(3.65) 


Xi, X2 ^ 0 


(3.66) 


xi, X 2 eM 




(5i, (526(0, 1} 





If we solve this alternative formulation we hrst obtain the LP relaxation solution 

xi = 2.759, X 2 = 2.795, = 0.422, 82 = 0.578, Objective = 13.867 

(3.67) 

followed by the integer optimal solution given in (3.29). 

The difference between the solution in (3.67) and (3.29) (the duality gap) is 
greater than that between (3.28) and (3.29). This indicates that this new formulation 
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is not as tight as the other formulation. In this sense it is not as good. In general 
branch-and-hound would take longer to obtain the integer optimum. 

At the opposite extreme to the DNF given in Example 3.7 we consider CNF for 
this example. 

Example 3.8 Express the logical statement in Example 3.7 (and 3.2) in CNF and 
formulate and solve the resultant IP model. 

In CNF we have 



[2xi + X2 ^ 6 V xi + 3x2 ^ 9] 

■ [2xi + X 2 ^ 6 V 7xi + X 2 ^ 20] 

• [x2 < 4 V xi + 3x2 < 9] 

■ [x 2 ^ 4 V 7xi + X 2 ^ 20] 

•2xi + 2 x 2 ^ 9 

xi, X 2 ^ 0 (3.68) 

Xi, X 26 R 

Introducing 0-1 variables & 2 i, ^22 to distinguish between the constraints 

in the disjunctions (i.e. index ij refers to the ith constraint in the yth clause of the 
DNF form of problem (3.56)) gives the model 
Maximise 

3xi + 2x2 (3.69) 

subject to 



2xi + X 2 "h 45p ^ 10 


(3.70) 


x\ + 3 x 2 + 6 S 12 ^ 15 


(3.71) 


X 2 < 4 


(3.72) 


7xi + X 2 + 5^22 ^ 25 


(3.73) 


2xi + 2x2 ^ 9 


(3.74) 


+ S12 ^ i 


(3.75) 


+ 522 ^ 1 


(3.76) 


52i + ^ 1 


(3.77) 


52i + 52j ^ 1 


(3.78) 


XI, X 2 ^ 0 


(3.79) 



xi, X2eK 
3i,32e{0, 1} 



Solving the LP relaxation of this model (as with the DNF-based formulation) 
produces the solution (3.28), i.e. the same as for the DNF formulation showing them, 
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in this case, to be of equal strength. It can be shown that a DNF-based formulation 
(so long as the same M and m bounds are used for each constraint) will always be at 
least as tight, and sometimes tighter, than a CNF-based formulation. The reason for 
this is explained below. What’s more the CNF formulation will often involve more 
0-1 variables than the DNF formulation since we need r (or strictly |'log 2 r]) 0-1 
variables to model each disjunction. If we have a conjunction of n such disjunctions 
then we will require a total of rn (or [log 2 ) 0-1 variables, whereas in DNF a 
conjunction of m disjunctions could be modelled with m 0-1 variables. In practice 
substantial simplifications may be possible in either type of formulation, and the 
size of the CNF or DNF representation will depend on the problem, as discussed 
in Chapter 1 . The tightness of the resultant IP is often of greater importance than 
the compactness of the model in terms of number of variables (and constraints). 
In Sect. 3.4 we show how it is possible to guarantee the production of a sharp IP 
formulation of a disjunctive programme, but often at the expense of a very large 
number of 0-1 variables. 

In order to show why the LP relaxation associated with a DNF formulation is 
always at least as tight as that associated with a CNF formulation we consider 
general logical statements of constraints in both forms. For convenience we will 
consider all the constraints in the form. 

In DNF we have sets (indexed by j) of conjunctive clauses (indexed by ij) each 
consisting of constraints at.tXi.k ^ bi. for ij = 1,2, ..., . These will be writ- 

ten (for economy of notation) as /, ^ 0 . 

We introduce 0-1 variables &j. Each &j, when taking the value 1, forces all the 
constraints in the yth set to hold. The disjunction is formulated as 

(3.80) 

J 

The constraints are forced to hold by the MIP constraints 

fij + (Mi. - bi.) Sj < Mi. - bi. Vj ij (3.81) 

where the M, are upper bounds on the values of the ft . + bi. as derived in (3 . 1 0) or 
(in form) (3.46). 

In CNF, using the distributive rule (1.7), each original constraint will appear in 
each of the disjunctive clauses. Hence it is necessary to force each of them to hold 
by separate 0-1 variables &i. taking the value 1 giving 

fi. + (Mi^ - bi.) Si. < Mi. - bi. Vj ij (3.82) 

together with the constraints (for each disjunctive clause) 
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If we now impose the following extra constraints on this, CNF-hased, 
formulation 



Si.^Sj wij (3.84) 

and so eliminate variables Si we obtain the DNF-based formulation. Hence a DNF- 

h 

based formulation is always as tight as a CNF-hased one and for some problem 
instances will be tighter. 

When it is worth moving from CNF to DNF, in order to produce a tighter for- 
mulation, may depend on the particular problem. Even if a problem is not naturally 
stated in DNF or CNF it is usually better to move the conjunctive connective ‘in’ 
and the disjunctive connective ‘v’ ‘out’ (using the distributive laws) in the logical 
representation so approximating to DNF. 

Exercise 3.7.5 is to reformulate and solve Example 3.5 using CNF. It turns out 
that, for that example, we obtain the same LP relaxation solution (in the space of 
X and y) as that from the DNF formulation. 

However for both Examples 3.2 and 3.5 we will show, in Sect. 3.4, a formula- 
tion whose LP relaxation always gives the optimal integer solution for all problem 
instances. 

3.3.2 The Dual of a Disjunctive Programme 

There is no, very satisfactory, dual of a general IP model. The most satisfactory 
dual of a PIP is the Chvatal dual discussed in Chapter 2. However, if a disjunctive 
programme is written in DNE then it has a fairly obvious dual. In order to illustrate 
this we define the dual of Example 3.7. 

Example 3.9 Define the dual of Example 3.7. 

We take the LP duals corresponding to each of the clauses in (3.56) taken with the 
objective (3.58) and take the solution giving the larger objective value. This gives 

Minimise 



Max(6yi -|- 4y2 + 9y^, 9j4 + 20y5 -h 9ye) 


(3.85) 


2yi +2ys ^3 


(3.86) 


yi + yi + 2ys ^ 2 


(3.87) 


T4 + 7ys -h 2ye ^ 3 


(3.88) 


3y4 + ys + 2y6^ 2 


(3.89) 


yi, yi, ys, ys, ye ^ 0 


(3.90) 


yi, y 2 , y3, y4, ys, ye e K 





subject to 
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Notice that, for this model to be feasible, the dual models corresponding to each 
of the clauses in the primal model must also be feasible. However, the original, 
primal, model could be regarded as feasible even if some of its clauses (up to 
all except one) were infeasible. As pointed out in Chapter 2 it is possible for an 
LP to have both its primal and dual infeasible. If this were the case for the LP 
resulting from one of the clauses we could have a feasible disjunctive programme 
with an infeasible dual. To get around this we could stipulate a ‘regularity’ con- 
dition for the duality theorem to apply, i.e. for no clause should both the primal 
and dual LPs be infeasible. Alternatively (and more satisfactorily) we could define 
the maximum and minimum objective values for an infeasible LP as — oo and oo, 
respectively. 

There is a useful notation for representing the above dual model. We can repre- 
sent the operation Min(a, b) as a0ii (analogous to ‘-|-’) and the operation Max(a, b) 
as fl 0' i>. The above dual model can then be written as 

Minimise 

(6yi -h 4^2 + 9ys) 0' (9y4 -h 20^5 -h 9ye) (3.91) 

subject to 

(2yi + 2y3) 0 {y^ + lys + 2ye) ^ 3 (3.92) 

(yi + + 2y3) 0 (3y4 + ys + 2ye) ^ 2 (3.93) 

yi, yi, ys, y4, ys, y& ^ 0 (3.94) 

yi,yi, y3, ys> ye e K 



This can be regarded as an extension of the dual of an LP. There as we ‘pass 
over’ a conjunction, looking down a column, the corresponding dual operation is 
‘+’. Here we extend this so that when we pass over a disjunction the dual operation 
is ‘0’. The need to use ‘0’ ‘(min)’ and ‘0'’ ‘(max)’ arises from the fact that the 

right-hand-side coefficients in an LP are conventionally written on the right (!). If 
they had been written, together with the variables, on the left then we could have 
sufficed with ‘0’. 

Obviously we have to amend our definition of the dual of a disjunctive pro- 
gramme to deal with the case of a minimisation (of the primal model) and ‘^’ 
constraints. This is straightforward and not covered here. 

This dual of a disjunctive programme obviously captures many of the features of 
the dual of an LP, stated in Sect. 2.1, i.e. equal objective values if both are solvable 
(subject to the proviso discussed) and symmetry (it can be shown that the dual of 
the dual is the primal: see Exercise 3.7.7). 

An algebra, known as ‘minimax’ algebra has been developed around the use of 
symbols ‘0’ and ‘0'’. References are given in Sect. 3.6. 
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3.4 Convexiflcation of an IP Model 

The concept of a convex hull can be extended beyond that of the smallest convex 
region containing feasible integer points. If we restrict ourselves to considering 
the disjunctions of constraints in the original space in which they are stated 
(i.e. xi and X 2 in the case of Example 3.2 and x and y in the case of Example 3.5) 
then we can define the convex hull of the disjunction of the resultant polyhedra. It 
is the smallest convex set containing all the polyhedra making up the disjunction. 
We draw the convex hulls corresponding to the disjunction of polyhedra in Eigs. 3.4 
and 3.5 in Figs. 3.8 and 3.9, respectively. 

^2 




If we could append the extra constraints to represent the convex hulls then we 
could solve these examples as LPs producing the vertex solutions at F and F, respec- 
tively. In general (when there is no visualisable geometrical representation), the 
creation of constraints representing convex hulls is very difficult. We have given 
different IP formulations involving the 0-1 variables. These have different convex 
hulls (in these higher dimensional spaces). It is sometimes possible to project their 
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LP relaxations down (using the method described in Sect. 2.1) into the spaces of the 
original problems to give approximations to the convex hulls. 

However, there is another, powerful, method of reformulating a disjunctive pro- 
gramme to give its convex hull representation by introducing (possibly a very large 
number of) new variables. 



3.4.1 Splitting Variables 

Example 2.10 demonstrates that ‘disaggregating’ constraints can improve the tight- 
ness of the LP relaxation. This can be taken further if we first split variables, in 
the original space of a disjunctive programme, into components. We demonstrate 
this by using Example 3.5 to produce what has become known as a ‘disjunctive’ 
formulation (although all the formulations involving ‘v’ are, in a sense, disjunctive). 



Examp/e 5.70 Formulate Example 3.5 as a MIP by splitting the variables 
into components for each clause in the disjunction to produce a ‘disjunctive’ 
formulation. 
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Each of the variables x and y is split into a component for each of the three 
(conjunctive) clauses in (3.32) by the constraints 



xi + X 2 + X 3 — X (3.95) 

yi + yi + ji^y (3.96) 



Three 0-1 variables are used to apply the clauses of the disjunction in the follow- 
ing way: 



2j:i - yi ^ -3Si 


(3.97) 


-xi +yi ^ 28i 


(3.98) 


W 

0 


(3.99) 


4X2 - 2y2 ^ 482 


(3.100) 


—2x2 + 2y2 ^ —282 


(3.101) 


X2 + y2^ 282 


(3.102) 


6x3 - 3 j3 ^ 0 


(3.103) 


—3x3 + 3y3 ^ 53 


(3.104) 


2x3 + 3y3 ^ 3Sj 


(3.105) 


81 + 82 + 83 = 1 


(3.106) 



By virtue of (3.106) exactly one of 81 , 82 , Sj, will take the value 1, the others 
taking the value 0. Hence one of the clauses will hold for the corresponding com- 
ponent variables. For the other clauses the right-hand sides of the constraints (in the 
corresponding component variables) will become 0. 

Suppose, for example, = 0, ^2 = 1, i53 =0 then the first of the extreme ray 
constraints in each clause are 



2xi-yi^0 (3.107) 

4x2-2j2^4 (3.108) 

6;c3-3y3^0 (3.109) 

Adding these constraints together (in suitable multiples), and using (3.95) and 
(3.96) gives 



4x-2y^4 (3.110) 

i.e. the first extreme ray constraint in the second clause, in the space of the original 
variables. 

Similarly, adding together the second extreme ray constraints in each clause gives 



— 2 x + 2 y ^ —2 



(3.111) 
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i.e. the second extreme ray constraint in the second clause, in the space of the origi- 
nal variables. 

Since the extreme ray constraints corresponding to i5i =0 and ^3 = 0 go through 
the origin they form a cone in the space of the appropriate component variables. The 
non-extreme ray constraints, in the components corresponding to these clauses, also 
go through the origin. They are 



(3.112) 

(3.113) 

(3.114) 

We illustrate the situation, for the third clause, in Fig. 3.10. 



x\ ^ 0 
xi + yi'^O 
2X3 + 3 j 3 ^ 0 



% 




This implies that the component variables in the clauses for which Si — 0 are 0 
when these constraints are satished as equalities. Therefore, as a result of (3.95) and 
(3.96), the appropriate constraints apply for the relevant clause in the space of the 
original variables. For this example we would have 



X + y ^ 2 



(3.115) 
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The constraints of (3.34)-(3.41) are therefore relaxations of (3.97)-(3.106). 
Hence our new (disjunctive) formulation is at least as tight as the DNF formulation. 
Solving the LP relaxation of this new formulation gives the integer solution (3.43) 
showing it to be sharp. 

If all the component polyhedra of a disjunction are closed then all except one of 
the polyhedra, in the component variables, reduce to the point at the origin, when 
the corresponding variables are 0. Therefore the clause corresponding to 5, = 1 
applies in the space of the original (non-split) variables. 

Although this type of formulation introduces many new variables (a component 
for each variable in each clause, whether or not the clause contained the variable 
in the first place), in practice many of them can often be shown to be redundant 
leading to a vast reduction in the size of the model. Such reductions depend on the 
individual type of model and are difficult to generalise. 

A ‘disjunctive’ formulation is always sharp (it models the convex hull). We now 
show why this is the case. In order to do this we consider the dual of the LP relax- 
ation of the model in Example 3.10. 

Example 3.1 1 Create the dual of the LP relaxation of the model created in 
Example 3.10, together with objective (3.31) (written in the component variables). 

Representing the dual variables on constraints (3.97)-(3.106) as u\,U 2 ,u^, 
Ui, i> 2 , i> 3 , W\, W 2 , 1 U 3 , z we obtain the model 

Maximise 



z (3.116) 

subject to 



z < — 3 mi -f- 2 u 2 


(3.117) 


z < 4ui - 2u2 -I- 2i>3 


(3.118) 


Z < W2 + 3 W 3 


(3.119) 


2mi — M2 -h M3 ^ 1 


(3.120) 


—Ml 4- M2 ^ 3 


(3.121) 


4ui — 2 u2 -h l>3 < 1 


(3.122) 


-2vi -t- 2 i> 2 -h U3 < 3 


(3.123) 


6wi — 3 w2 + 2x1)2, ^ 1 


(3.124) 


— 3w;i + 3x1)2 -h 3 w 3 < 3 


(3.125) 


Ml, M2, M3, VI, V 2 , U3, XVI, XV 2 , IU3 ^ 0 


(3.126) 



Ml, M 2 , «3, V\, l> 2 , l> 3 , U> 1 , IV 2 , IV 2 , eM. 



Using the notation introduced in Sect. 3.3 we can write the above model as 
follows: 
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Maximise 



(-3mi + 2m 2) 0 (4i>i - 2v2 + 21 ) 3 ) © (W2 + 3w2) (3.127) 

subject to 

(2mi — U2 + M 3 ) ©' (4ui — 2v2 + V 3 ) ©' (bull — 3w2 + 2 .W 2 ) < 1 (3.128) 

(-Ml + M 2 ) ©' (-2ui + 2 u2 + P 3 ) 0 ' (-3m;i + 3w2 + 3w3) < 3 (3.129) 

Ml, m 2 , M 3 , Pi, V 2 , P 3 , iv\, W 2 , 1 P 3 ^ 0 (3.130) 



Ml, m 2, M3, Pi, P2, P3, iPi, 1V2, 1P3 e M 



This is obviously the dual of the disjunctive programme stated in Example 3.5. It 
therefore has the same optimal objective value (so long as both models are feasible). 
Hence the LP dual of the model above, namely the LP relaxation of the MIP model 
created in Example 3.10, is an LP representation of the disjunctive programme in 
Example 3.5, i.e. our ‘disjunctive’ formulation (by splitting variables) is sharp. This 
demonstration is entirely general and applies to any disjunctive programme. We 
have represented the convex hull of the disjunctive formulation (in a higher dimen- 
sional space). It is possible (but often of exponential complexity) to project this 
formulation back into the space of the original variables (e.g. using Eourier-Motzkin 
elimination, as described in Chapter 2). 

We should also point out that the possibility of this type of formulation does 
not affect the issue of MIP representability, discussed in Section 3.2. This type of 
formulation is possible if and only if the problem is MIP representable. 



3.5 Modelling Languages Based On Logic 
3.5.1 Algebraic Languages 

The use of modelling languages to aid the building of LP and IP models has now 
become widespread. These languages typically use index sets to refer to variables 
and constraints in the model, so allowing classes of variables and constraints to be 
referred to once. They also allow the structure of the model to be separated from 
the data. It then becomes possible to generate variables, constraints and objectives 
conditional on the data, e.g. if particular numerical conditions hold or entities are 
members of particular sets. In order to do this logic, in the form of the propositional 
calculus, is usually incorporated into the language. 

Pull details of any one language are specific to the software system used and 
described in the appropriate manual. However, we illustrate typical logical facilities 
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by means of the commonly used language AMPL (A Mathematical Programming 
Language). 

This language allows data to be defined in terms of numerical scalars, vectors 
and arrows as well as index sets. It is then possible to make statements about the 
data which may be true or false. Compound propositions can then be constructed 
from these statements. In AMPL the possible ‘connectives’ which may be used are 



or, not, and, if ■ ■ ■ then . . . else, exits, forall (3.131) 



obviously representing respectively. 

A typical statement in AMPL might be 



subject to (3.132) 

Capacity{i in MACHINES] : sum(j in P RO DUCT S) 

T ime[i, J] * x[i, j] 

< = (exists (Time[i, j] > 0) 
and not{i in P RECLU DE}) 

Cap[i] 



This statement generates a number of constraints indexed by the set MACHINE S. 
The number of each product (indexed by P RO DUCT S) produced is represented 
by an LP (or IP) variable x[i, j]. Each variable has a coefficient Time[i, J] taken 
from a data array. The capacity (in hours) of each machine is given in the data 
vector Cap[i]. 

However, the constraint will only be generated if the following statement is true: 



exists (T ime[i, j] > 0) and not{i in P RECLU DE] (3.133) 



i.e. both at least one of the products takes a positive time on the relevant machine 
and the machine is not one of a precluded set for which no capacity constraint is 
needed. 

(Note that the names of the sets and data arrays have been chosen to make the 
statement meaningful. They have no significance in the language.) 

This language is typical of the ‘state of the art’ in such algebraic languages. It is 
also possible, in some languages, to generate constraints using the rich language of 
constraint logic programming, as described in Chapter 4. 

We now describe a language and method of generating IP models which goes 
beyond what can be achieved by purely algebraic modelling languages. 
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3.5.2 The ‘Greater Than or Equal’ Predicate 

We define the predicate ge{r, Pi, P 2 , . . . , P„) as meaning greater than or equal to r 
of the statements P\, P 2 , ■ ■ ■ , Pn are true. (This predicate could equally well have 
been called ‘at-least’.) This predicate is itself a statement which will be true or false. 
(Obviously if r > n it must be false.) 

If /- = 1 it represents the disjunction 



PiV P 2 W ■■■V P„ (3.134) 

and if r = n it represents the conjunction 

Pi Pi Pn (3.135) 

By taking values of r between 1 and n we can represent compound statements 
between (multivalued) ‘v’ and Such statements could, of course, be written 
in terms of the conventional connectives but such representations would be more 
cumbersome. 

What’s more the statements using ‘ge’ can be nested, i.e. the component state- 
ments Pi can themselves be ge predicates. 

We illustrate this by an example 

Example 3.12 Represent the following statement in DNF using conventional con- 
nectives: 



ge{2, ge(l, Pi, P 2 ), P 3 , ge{2, P 4 , P 5 )) (3.136) 

we can progressively reexpress (3.136) (from ‘the bottom up’) as follows: 

ge{2,PiVP2,P3,P4-P5) (3.137) 

(Pi V P 2 ) ■ P 3 V (Pi V P 2 ) ■ P 4 ■ P 5 ) VP 2 -P 4 -P 5 (3.138) 

Pi - P 2 V P 2 - P 3 V Pi - P 4 - P 5 V P 2 - P 4 - Ps-V Pi- P 4 - P 5 (3.139) 



At the bottom level the component statements will be LP or IP constraints or 
0-1 variables which can individually be true or false. The representation entirely 
in terms of nested ge predicates gives a ‘normal form’ (although it is not unique) 
into which all models can be put. The translation into an IP model can then be 
automated. We describe how this may be done below. Before doing this, however, 
we point out that nested ge predicates are not a natural way of modelling a problem. 
Instead we use a richer set of predicates, together with the conventional connectives 
of the propositional calculus. Then we translate into the ge form before translating 
into an IP model. 

Useful predicates which we use (in addition to the conventional connectives and 
ge itself) are as follows: 
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le{r, Pi, P2, . . ■ , Pn) meaning less than or equal to r of P2, ■ ■ ■ , Pn 
are true. 

eq{r, P\, P2, . . . , P„) meaning exactly r of Pi, P2, . . . , P„ are true. 
gef(r. Pi, P2, . ■ ■ , Pn) meaning greater than or equal io r of Pi, P2, ■ ■ ■ , Pn 
are false. 

lefir. Pi, P 2 , ■ ■ ■ , Pn) meaning less than or equal to r of Pi, P2, ■ ■ ■ , Pn 
are false. 

eqf{r. Pi, P2, ■ ■ ■ , Pn) meaning exactly r of Pi, P2, . . . , Pn are false. 
le{r. Pi, P2, ■ ■ ■ , Pn) meaning less than or equal to r of Pi, P2, ■ ■ ■ , Pn 
are true. 

atAeast{r, Pi, P 2 , ■ ■ ■ , Pn) meaning at least r of Pi, P2, . . . , Pn are true, 
i.e. ge(r, Pi, P2, . . . , P„). 

at_most(r. Pi, P2, ■ ■ ■ , Pn) meaning at most r of Pi, P2, ■ ■ ■ , Pn are true, 
i.e. Ze(r, Pi, P2, . . . , P„). 

all(Pi, P2, . . . , Pn) meaning all of Pi, P2, . . . , Pn are true, i.e. ge{n. Pi, 
P2, ...,Pn). 

none(Pi, P2, . ■ ■ , Pn) meaning none of Pi, P2, . . . , Pn are true, i.e. ge( 0 . Pi, 
P 2 , ...,Pn). 

These predicates are for modelling purposes. Ultimately they will be converted 
(automatically) into the ge form, within the system, by rules given below. In addition 
we may wish to use the rich variety of predicates available from constraint logic 
programming, as described in Chapter 4 . 

The conversion of the above predicates to ge form is straightforward and 
left as Exercise 3 . 7 . 7 . For illustration, however, we consider the le predicate 
le(r. Pi, P2, . . . , Pn) converts to ge{r + 1 , Pi, P2, . . . , Pn)- 

Negations of a ge can always be moved inwards (a generalisation of De Morgan’s 
laws), i.e. ~ge(r. Pi, P2 , . . . , P„) converts to ge(n — r + 1 , Pi, P2, . . . , P«). 

Models resulting from this approach to IP formulation may, sometimes, be 
simplihed. We point out two obvious simplifications here which we refer to as 
‘flattening’ 

ge{l,ge(l,Pi,P 2 ,...,Pn.), ( 3 . 140 ) 

Pm-\-l, Pm-1-2, • • • , Pn) 

can be rewritten as 

M 1 ,Pi,P 2 ,...,P„) ( 3 . 141 ) 

and 

ge(n + 1 , ge(m. Pi, P2, . . . , P^), ( 3 . 142 ) 

Pfn-l - 1 ? Pm-1-2 , - - - , Pn ) 

can be rewritten as 

ge(n,Pi,P2,---,Pn) ( 3 . 143 ) 



so removing one level of nesting in each case. 
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These two simplifications are the result of the associativities of ‘v’ and 
respectively. 

We demonstrate how to use the ‘ge’ predicate to create an IP model by the fol- 
lowing example. 

Example 3.13 Model the following condition among constraints by a MIP model: 

'^^aijXj ^biV ''^^aijXj ^ i>2 ^ ^ ^3 ^ ^ b4 (3.144) 

j j j j 



It is convenient to represent the four inequalities by Pi, P 2 , P 3 , P 4 . At the ‘top 
level’ we model the by 

ge( 2 , (Pi V P2) -> (P3 V P4), (P3 V P4) ^ (Pi V P2)) (3.145) 

Proceeding down the component statements give 

ge{ 2 , geiiriPx V P2), (P3 V P4)), (3.146) 

ge(l,'(P3 VP4),(Pi VP2))) 



ge{2,ge{\,ge(2,{Pi,P2)), (3.147) 

ge(l, Pi, P 4 )), 
ge(l, ge( 2 , (Pi, P4), 

ge(l. Pi, P 2 )))) 

By flattening we can simplify (3.147) to 

ge(2,ge(l,ge(2,(Pi,P2)), (3.148) 

P 3 , Pa), 

ge(l, ge(2, (Pi,P4)), 

PuPi)) 

At the bottom level we introduce variables 5i , ^ 2 , ^ 3 , ^4 and model the conditions 

S, = 1 Pi (3.149) 

Assuming suitable bounds are known for the expressions in the inequalities then 
we can model these conditions by the methods given in Sect. 3.1. Otherwise we can 
use the method for open polyhedra described in Sect. 3.2 (so long as the problem is 
MIP representable). Assuming the bounds are known this gives 
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"^aijXj + (Ml — bi)Si < Ml 
j 


(3.150) 




a 2 jXj + {m2 — b 2)&2 ^ m2 
j 


(3.151) 




a^jXj + {mo, — bo,)&o, ^ m3 
j 


(3.152) 




a 4 jXj + (M4 — b4)84 ^ M4 
J 


(3.153) 


Since the predicates ge(2, (Pj, P2)) and ge( 2 , (P3, P4)) are of the form (3.135) 
(conjunctions) each can be represented by single 0-1 variables using the conditions 




S5 = 0 Pi 


(3.154) 




S5 = 0 P2 






Sfi = 0 P3 


(3.155) 




= 0 -o- 4 




giving 


'^^aijXj — {m\ — b\+ s)S5 ^ b\ + s 
j 

- (M2 - ^2 + < ^2 - e 

j 


(3.156) 




(3.157) 




- (M3 - ^3 + e )36 < &3 - e 
j 


(3.158) 




— (^^4 — ^4 + ^)<56 ^ ^4 + £ 
j 


(3.159) 


Moving up one level we model 






ge{l,ge{2,{Pi,P2)), P3,P4) 


(3.160) 


by 


^3 + ^4 — ^5 ^ 0 


(3.161) 


and 


ge(l,ge( 2 ,(P 3 ,P 4 )), A,P2) 


(3.162) 


by 


(5i + ^2 — ^6 ^ 0 


(3.163) 
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No 0-1 variable is necessary at the top level since we have a predicate of the 
form (3.135). Therefore the full condition (3.144) is modelled by (3.150) to (3.153) 
together with (3.156) to (3.159) and (3.161) and (3.163). 



3.6 References and Further Work 

Williams [117] presents the relationship between connectives in the propositional 
calculus and IP constraints. This is also covered in Williams [112, 115, 119]. 
Other authors have also written on the subject such as Barth [12], Hiirlimann 
[58], Dietricht et al. [33], Chandru and Hooker [22], Hooker [54, 56] and Wilson 
[ 121 , 122 ] . 

Balas [5, 7, 9] was the first to develop the theory of disjunctive programming and 
the resultant cutting planes in [6]. Sherali and Shetty [100] also cover the subject. 



The theory of MIP representability was developed by Meyer [82] and Jeroslow 
[62, 63]. Jeroslow also (stemming from the work of Balas) produced the idea of the 
splitting of variables to lead to convex formulations (under the slightly misleading 
title ‘convex formulations’). Jeroslow and Lowe [64] describe modelling techniques 
based on these methods. Martin[78] obtains a number of IP reformulations by means 
of new variables. 

Improved (tighter) IP formulations, by means of splitting variables, exist for a 
number of applications. Sometimes it is possible to (fictitiously) introduce ‘multiple 
commodities’ into a problem where there is only one commodity to increase tight- 
ness. Wolsey [126] does this for the lot-sizing problem and Wong [127] and Claus 
[25] for the network flow formulation of the travelling salesman problem. 

Williams [116] gives the duality-based proof to show that the convex formulation 
is always sharp. Minimax algebra (using the operators ‘0’ and ‘0'’) was developed 
by Carre [21] and Cunningham Green [28]. 

Modelling languages have been developed by a number of authors such as Tourer 
et al. [37] for the AMPL system and Day and Williams [32]. McKinnon and 
Williams [80] developed the modelling system based on the recursive use of the 
‘greater-than-or-equal-to’ predicate. They also show how to avoid introducing a 
component for every variable in every disjunction. Hadjiconstantinou et al. [50] 
present another way of systematically converting a logical expression into an IP 
model. Greenberg and Murphy [48] compare different modelling systems. 



3.7 Exercises 



3.7.1 Model the disjunction of r constraints using [logj r] 0-1 variables as 
described in Sect. 3.1. Is the LP relaxation weaker than that for the formulation 
using r constraints? 
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3 . 7.2 Show that if the open facets of a disjunction of polyhedra contain at least as 
many extreme rays as the number of variables then they represent parallel facets. 

3 . 7.3 Give an example of a set of polyhedra with the same recession directions 
whose open facets are not parallel. Give a MIP representation of the disjunction of 
these polyhedra without splitting the variables. 

3 . 7.4 Show that statement (3.54) is equivalent to (3.55). 

3 . 7.5 Reformulate Example 3.5 in CNF and solve the resultant IP. 

3 . 7.6 Formulate the dual of Example 3.5. 

3 . 7.7 Show that the dual of the dual of a disjunctive programme is the primal. 

3 . 7.8 Convert the predicates in Sect. 3.5 into ‘ge’ form. 

3 . 7.9 Solve Example 3.2 with objective (3.20) using the method described in 
Example 2.3. 

3 . 7.10 Solve Example 3.5 using the method described in Example 2.3. 




Chapter 4 

The Satisfiability Problem and Its Extensions 



Given a statement in logic can it ever be true? This is the satisfiability problem. We 
will confine our attention to the propositional calculus. However, the problem also 
arises in the predicate calculus where it is necessary to consider if there are instan- 
tiations of the variables which make a statement true. The problem is equivalent to 
the inference and consistency problems mentioned in Chapter 1 . 

Suppose we want to decide if 



P^Q (4.1) 

i.e. can we infer Q from P, where P and Q are logical statements or sets of logical 
statements. Equation (4.1) is equivalent to the logical statement 

P ^ Q (4.2) 

(which could also be written as E V 0. 

We simply need to test if (4.2) is always true, i.e. if its negation can never be true. 
The negation can be represented by 



PQ 



(4.3) 



using De Morgan’s laws. 

Hence is (4.3) ever true?, i.e. is it satisfiable? If it is then the inference (4.1) does 
not hold, otherwise it does hold. 

Conversely if we have a satisfiability problem of testing if a statement, or set of 
statements, P is satisfiable we can (trivially) convert it to the inference problem 

P F (4.4) 

If this inference is not true then P is sometimes true, i.e. it is satisfiable. 

If we wish to check if a set of statements 

Pi,P2,P3, ...,P„ (4.5) 
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is consistent then we simply need to test if the statement 

Pi- Pi- P3---Pn 



(4.6) 



is satisfiable. 

Conversely saying P is satisfiable is the same as saying it is consistent. 
Satisfiability problems with a small number of literals could be solved by the use 
of truth tables as follows. 

Example 4.1 Is the following statement satisfiable? 



(Xi ^ {X 2 V X3)) ^ (Xi . Z3) (4.7) 

We can successively fill in the columns of the following truth table (Table 4.1) 



X2 X3 
TXT 
T T F 
TFT 
T F F 
FIT 
F T F 
F F T 
F F F 



Table 4.1 A Truth Table for Satisfiability 



W VX3 
T 
T 
T 
F 
T 
T 
T 
F 



Xi ^ X2 V X3 
T 
T 
T 
F 
T 
T 
T 
T 



Xi .X3 
F 
F 
F 
F 
F 
T 
F 
T 



(Xi ^ X2 V X3) ^ (Xi . X3) 
F 
F 
F 
T 
F 
T 
F 
T 



This demonstrates that the statement is satisfiable. It is satisfiable for the three 
truth table settings 



= T, X2 = X3 = F 
Zi = F, X2 = T, Z3 = F 
Xl =X2 = X3 = F 

It can be seen that the number of rows in such a truth table will be 2", where n is 
the number of literals. 2" grows exponentially rendering such an approach impracti- 
cal for other than small problems. We present practical alternatives (although in the 
‘worst-case’ satisfiability problems are of exponential complexity, as discussed in 
Section 2.4). 



4.1 Resolution and Absorption 

In order to carry out a more efficient procedure we write a logical statement in 
conjunctive normal form (CNF). Therefore we consider statements written as a 
conjunction of disjunctive clauses. We will use the following, larger, example. 
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Example 4.2 Is the following statement satisfiable? 

Y[y X^y Xa, (4.8) 

V X 3 V X 4 (4.9) 

■X 1 VX 3 (4.10) 

■XlyXA, (4.11) 

■YlyX^yXl (4.12) 

■Xiv^ (4.13) 

■^vX 3 (4.14) 

■X 2 VX 4 (4.15) 

■X 3 VX 4 (4.16) 



i.e. we have a conjunction of the (disjunctive) clauses which are written on each line. 

If it turns out that a conjunction of clauses, such as that above, is not satisfiable 
then there is also interest in finding the maximum number of clauses which are, 
together, satisfiable. This is known as the maximum satisfiability problem. We will 
show how to solve it as an IP model in Sect. 4.5. 

Before solving Example 4.2, in Sect. 4.2, we define two procedures needed. 

Resolution consists of choosing two clauses which have exactly one literal which 
is negated in one clause and unnegated in the other (they may or may not have other 
literals in common, but if so they must have the same sign). 

An example is provided by clauses (4.9) and (4.12). 

^vA 3 VX 4 and Y[yXj,yYl (4.17) 

Two such clauses can be combined by forming their disjunction and deleting the 
common literal (in this case X 4 ) which is negated in one and unnegated in the other. 
The resultant clause is referred to as their resolvent. In the case of (4.17) we obtain 
the resolvent 



X 1 VX 2 VA 3 (4.18) 

Clearly the resolvent is implied by the original two clauses since the eliminated 
common literal (X 4 ) is either true or false. If it is true the remaining part of one of the 
clauses is true and if it is false the remaining part of the other clause is true. Hence 
either way the disjunction of them is true giving the resolvent, i.e. the resolvent is 
an implication of the original statement. 

Absorption involves looking for clauses that contain all the literals of another 
clause and the common literals have the same sign. The smaller clause is said 
to absorb the larger. For example (4.16) absorbs (4.8). Clearly the smaller clause 
implies the larger which may therefore be removed (obviously if there are duplicate 
clauses either one can be removed). 
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To simplify a logical statement in CNF we repeatedly apply absorption and reso- 
lution, appending the resolvent to the set of clauses and removing absorbed clauses, 
until no further simplification is possible. The result is a ‘simpler ’ but equivalent 
statement. 

Applying these procedures to the example we finally obtain the conjunction 



X2 ■ X4 ■ (Xi V Z3) ■ (Xi V Xi) 



(4.19) 



The individual clauses obtained (in this example X2, X4, X\ V X3, X\ V X3) are 
known as prime implications. They are the smallest (i.e. least number of literals) 
disjunctive clauses that are implied by the original statement. 

This procedure delivers us the complete sum of prime implications, i.e. every 
possible minimal clausal prime implication. All other clausal implications would be 
absorbed by a prime implication (and therefore not be prime). 

In order to show this suppose we were to have a disjunctive clause C which is 
implied by the original statement and is the longest clause not absorbed by any of the 
prime implications. We can, without loss of generality, assume that all the literals 
in C are unnegated (by, if necessary, replacing the negated literals throughout by 
unnegated ones). Therefore let 



C = Zi V A2 V • • • V Xr 



(4.20) 



In order for C to be false X\ — X2 — ■■■ — Xr — ¥ . Therefore C cannot 
contain all the literals in the original statement as the only way for the original 
statement to be false would be for all the literals in the prime implications to be 
unnegated. This would cause them each to absorb C. Therefore C must not contain 
some variable, say A, . In this case, since C is the longest clause not absorbed A,- V C 
and A, V C which are both implied must be absorbed by prime implications. The 
prime implication which absorbs A, V C must contain A, otherwise it would absorb 
C. The prime implication which absorbs A, V C must contain A, for a similar 
reason. The resolvent of these two implications would be a prime implication which 
absorbs C contradicting the definition of C. 

The complete sum of prime implications is not, however, necessarily the simplest 
equivalent statement. There might be a proper subset of the prime implications, the 
conjunction of which is still an equivalent statement. The finding of the minimal 
equivalent set of prime implications is, however, another (difficult) problem. We 
address this problem in Sect. 4.6. For the small example above, the four derived 
prime implications also form the minimal set. However, we will now use resolution 
and absorption, together with a branching procedure, for the more modest aim of 
determining if a statement in CNF is satisfiable. 




4.3 Representation as an Integer Programme 



109 



4.2 The Davis-Putnam-Loveland (DPL) Procedure 

If a statement is not satisfiable then the empty clause is a prime implication (which 
absorbs all other implications rendering it the only one) and this will be obtained by 
successively applying resolution and absorption. Otherwise, if the empty clause is 
not obtained, then the statement is satisfiable. A satisfiable set of truth values can be 
obtained as follows: 

1 . (a) If a clause consists of a single literal X then set A = T. Similarly if a clause 
consists of a single literal A set A = F. Note: There will be no other clauses 
containing A or A by the application of absorption. 

(b) If a literal is unnegated in all clauses within which it occurs set A = T and 
delete the clauses within which it occurs. (There may be alternative satisfy- 
ing settings but this is sufficient if we are content only to find a satisfiable 
setting.) Similarly if a literal is negated in all clauses within which it occurs 
set A = F and delete all clauses within which it occurs. (Again there may be 
alternatives.) 

(c) If a literal A is unnegated in some clauses and negated in others partition the 
clauses into two sets (i) and (ii). 

(i) Set A = T and delete the clauses in which it is unnegated. 

(ii) Set A = F and delete the clauses in which it is negated. 

Each of the sets (i) and (ii) after this ‘branching’ is then treated as before by 
procedures (a), (b) and (c). It can be shown (Exercise 4.12.3) that at least one of 
the branches will result in a satisfiable set of truth values. 

For, comparatively simple. Example 4.2 resolution and absorption resulted in 
(4.19): (a) sets X 2 = F, A 4 = T; (b) does not apply and (c) creates the settings 
(branching on, say, Ai): (i) Ai = T, A 3 = T and (ii) X\ = F, A 3 = F. 

After applying (b) we obtain the alternative sets of satisfying truth values 



Ai = T, 


^2 


= F, 


A 3 


= T, 


A 4 = T 


(4.21) 


Ai = F, 


2f2 


= F, 


^3 


= F, 


H 

II 


(4.22) 



4.3 Representation as an Integer Programme 

The satisfiability problem can be represented as the problem of deciding if an integer 
programme (IP) is feasible using the modelling methods described in Chapter 3. 
However, the satisfiability problem has a special structure as an IP which should be 
explained. 

If the satisfiability problem is expressed in CNF then each (disjunctive) clause 
gives rise to a constraint. As in Chapter 2 we represent the truth or falsity of a literal 
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Z by a 0-1 variable x taking the value 1 or 0, respectively. If X occurs unnegated in 
a clause then x appears in the constraint. If it occurs negated then (1 — x) appears. 
The sum of these terms in the constraint must be ^ 1. 

For example the clause 



Zi V Z 3 V Z 4 (4.23) 

gives rise to the constraint 

(1 - xi) + X 3 + X 4 ^ 1 (4.24) 

In more usual form this is written as 

— xi + X 3 + X 4 ^ 0 (4.25) 

For Example 4.2 the full set of constraints is therefore 



— xi X 3 -h X 4 ^ 0 


(4.26) 


-X 2 -h X 3 -h X 4 ^ 0 


(4.27) 


xi — X 3 ^ 0 


(4.28) 


— X 3 -h X 4 ^ 0 


(4.29) 


— Xi -h X 3 — X 4 ^ — 1 


(4.30) 


-Xi - X2 ^ -1 


(4.31) 


-X 2 -1- X 3 ^ 0 


(4.32) 


X 2 -h X 4 ^ 1 


(4.33) 


X 3 -h X 4 ^ 1 


(4.34) 



The general form of such constraints is 



r r-\-s 

- ^X; + ^ X; ^ (1 - r) (4.35) 

i=l ?=r+l 

Models with such constraints are a special case of generalised set-covering 
problems (GSCPs). (GSCPs, in contrast to the set-covering problem mentioned in 
Chapter 2, have all coefficients 0, ± 1 and general integer right-hand sides, with all 
constraints ‘^’.) This special sort of GSCP can easily be converted to a set-covering 
problem (SCP) where all the coefficients and right-hand sides are 0-1 by substitut- 
ing (1 — y, ) for X, in the first summation in (4.35). This then gives the constraints 



r r+^ 

i=l ?=r+l 



(4.36) 
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Together with the extra constraints 

= i = (4.37) 

If (4.37) were converted to the form then solutions satisfying the ‘=’ form 
could be sought by the choice of suitable objective function such as 



Minimise Y.^Xi+yd (4.38) 

i=\ 

If this objective can be minimised to r then (4.37) must be satisfied and the 
original problem is satisfiable. 

However, for our purposes it suffices to consider satishability problems in the 
form of GSCPs. 



4.4 The Relationship Between Resolution and Cutting Planes 

We could solve the problems described above as IPs using the branch-and-bound 
method described in Chapter 2. However, it is worth pointing out that we can often, 
with benefit, reduce the size of such models first. The methods of reduction we 
use mirror resolution and absorption as described in Sect. 4. 1 . We illustrate this by 
reference to constraints (4.8) and (4.16). 

If one constraint is implied by another it can be removed. This happens when one 
contains a subset of the variables of another with the same signs on the variables and 
a relaxation of the right-hand side. For example 

X 3 -h X 4 ^ 1 clearly implies — x\ + Q (4.39) 

i.e. (4.16) implies (4.8). This mirrors the absorption of Zi V X 3 V X 4 by X 3 V X 4 . 

Corresponding to the resolution operation we can add together two constraints 
which contain a variable which is negated in one constraint and not in the other. 
If all the other common variables have the same sign the resultant constraint is of 
significance. Consider the two constraints (4.27) and (4.30). Adding them together 
produces 



— X\ — X 2 + 2X3 ^ — 1 



(4.40) 



There is also 
form) —x\ ^ — 1 



no loss of generality in adding in the conditions (in negated 
and —X 2 ^ — 1 in order to make all the coefficients ±2 giving 



— 2xi — 2x2 + 2x3 ^ 



(4.41) 
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Since the quantity on the left-hand side is a multiple of 2 we can divide through 
by 2 (this is one of the operations for creating the Chvatal ‘dual’ as described in 
Chapter 2) to produce 



3 

- Xi - X 2 + X 3 ^ - (4.42) 

and round up the right-hand side to give 

- xi - X 2 + X 3 ^ -I (4.43) 

This procedure mirrors resolution applied to X 2 V X 3 V X 4 and X\ V X 3 V X 4 
to produce Xi V X 2 V Z 3 which is represented by the constraint (4.43). 

The above procedure, applied to constraints, is very significant. Not only we are 
adding constraints to produce a new constraint but we are also applying a rounding 
operation. The result of this procedure is a Rank 1 Cutting Plane as explained in 
Chapter 2. It therefore cuts off solutions to the linear programming (LP) relaxation 
but does not cut off any integer solutions. Hence the reduced model may well have 
a tighter LP relaxation than that of the original model. This, almost certainly, makes 
the subsequent solution by branch-and-bound shorter. 

It might be thought that there is an exact relationship between prime implications 
(the smallest implied clauses) and IP facets (the tightest constraints) as discussed in 
Chapter 2. This is not, however, the case (see Exercise 4.12.9). 

Exercise 4.12.7 involves showing that, in general, the procedure above exactly 
mirrors resolution. Exercise 4.12.8 involves showing that if one adds constraints 
with more than variable differing in sign no rounding (and hence strengthening) 
applies (although the resultant constraint is still valid). 

We now apply branch-and-bound to the set of constraints which results from 
reducing (4.26) to (4.34) by using the constraint forms of resolution and absorption. 
This corresponds to the logical statement (3.19) which was obtained after logical 
resolution and absorption applied to the original, logical, statements (3.8)-(3.16). 
The corresponding set of reduced constraints is 

— X 2 ^ 0, X 4 ^ 1, —xi + X 3 ^ 0, xi — X 3 ^ 0 (4.44) 

Together with the restriction that these variables be 0-1 and the (arbitrary) 
objective 



Minimise xi -|- X 2 -h X 3 -|- X 4 (4.45) 

we obtain the LP relaxation solution 



xi = 0, X2 = 0, X3 = 0, X4 = 1 



( 4 . 46 ) 
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This is clearly integral in this, very simple, case demonstrating that the original 
(logical) statement was satishable. In general we may obtain a fractional solution 
and have to resort to branch-and-bound. 

Notice that by choice of objective (4.45) we have produced only one of the two 
solutions corresponding to (4.22). The choice of objective can be arbitrary if we are 
only seeking to prove satisfiability. If we require all satisfying solutions then we 
have the much more difficult task of finding all feasible solutions to the IP. We do 
not address this task here. 



4.5 The Maximum Satisfiability Problem 

In order to illustrate further the use of IP we consider an important extension of the 
satishability problem and illustrate it by an example. 

Example 4.3 Is the following statement satishable and if not what is the maximum 
number of clauses which make it satishable? 



V X\2 


(4.47) 


X 2 \ V X 22 


(4.48) 


X31 V X32 


(4.49) 




(4.50) 


X\i V X31 


(4.51) 


X21 V X31 


(4.52) 


X \2 V X 22 


(4.53) 


X \2 V X32 


(4.54) 


X 22 V X32 


(4.55) 



This is a tiny example of the so-called ‘pigeon hole’ problem. This is the problem 
of seeing if it is possible to ht n+ 1 objects into n boxes with no more than one object 
in each box. Clearly this is not possible when looked at semantically. However, to 
prove that it is not possible syntactically (if we did not know the interpretation) is 
very difficult. It can be shown (Exercise 4.12.10) that, for the general problem, the 
number of steps needed to do this by resolution is an exponential function of n. 

For the above example we interpret Xij as meaning ‘object i is put in box j\ 
Clauses (4.47)-(4.49) impose the condition that each i must be put somewhere. 
Clauses (4.50)-(4.55) impose the condition that no more than one object can be put 
in each box j . Here we have three objects and two boxes. 

It is convenient to illustrate our application of resolution by means of the tree 
in Fig. 4.1. 
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□ 



Fig. 4.1 A resolution tree 



At each level we apply resolution. For example at level 1 we resolve clauses 
(4.47) and (4.50) to obtain Z 12 VX 21 which is then resolved with (4.48). Proceeding 
in this way we ultimately resolve Z 31 with Z 31 to produce the empty clause 
represented by ‘D’ showing that the original statement is not satisfiable. 

Note that at the end both X 31 and Z 31 are ‘non-input’ clauses, having been 
derived in the course of the calculation. This is of importance in distinguish- 
ing between ‘easy’ and ‘difficult’ problems and is addressed in Sect. 4.6 and in 
Exercise 4.12.1 1. 

Given that the statement in Example 4.3 is not satisfiable, what is the maximum 
number of clauses which can simultaneously be made satisfiable? This is an example 
of the maximum satisfiability problem. 

If we represent a general (disjunctive) clause as a constraint among 0-1 variables 
in the form of (4.35) (indexed as constraint j) we can force its satisfaction by a 0-1 
variable yj taking the value 1. In order to do this we write the constraint as (4.56) 
using the modelling methods discussed in Chapter 3. 



r r+5 

“ + Z ~ yj ^ 

i=l i=r+l 



(4.56) 
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By employing the objective 



Maximise E yj (4.57) 

j 

we maximise the number of clauses which are satisfied. 

We apply this model to Example 4.3 to give Maximise 

y\ + yi + + jA + y$ + ye + yi + y9 + T9 (4.58) 

subject to 



Til 


+ 


T12 


— 


yi 




0 


(4.59) 


T21 


+ 


T22 








0 


(4.60) 


T31 


+ 


T32 








0 


(4.61) 


Til 


+ 


T21 


■ 


T4 




2 


(4.62) 


Til 


+ 


T32 


+ 


T5 




2 


(4.63) 


T21 


+ 


T31 


+ 


ye 




2 


(4.64) 


T12 


+ 


T22 


+ 


yi 




2 


(4.65) 


T12 


+ 


T32 


+ 


ys 




2 


(4.66) 


T22 


+ 


T32 


+ 


y9 


< 


2 


(4.67) 




X 


ij€{0, 


1 } 









The LP relaxation produces a fractional solution. Proceeding to an optimal IP 
solution the maximal objective value is 8, showing that the model is not satisfiable. 
Among the alternative optimal integer solutions is 



Til = X22 — X22— 1, Ti 2 = T21 = X31 = 0 (4.68) 

yi = T2 = T3 = T4 = T5 = T6 = W = ys = 1, T9 = 0 (4.69) 



i.e. we can satisfy at most eight clauses. 

(This has the ‘useless’ interpretation that we can put object 1 in box 1 and objects 
2 and 3 in box 2 so satisfying the conditions that every object must be put somewhere 
but breaking one of the conditions that no more than one object can be put in any one 
box.) As a variant of the maximum satisfiability problem we might weight clauses 
differently according to their size. For example we might prefer smaller clauses to 
larger ones. This could easily be accomplished by giving different weights to the 
variables in the objective. 

Although resolution and absorption produce the minimal size implied clauses 
their complete sum may contain redundancies, i.e. the conjunction of a proper subset 




116 



4 The Satisfiability Problem and Its Extensions 



of the prime implications may be equivalent to the original statement. We illustrate 
this by an example. 



4.6 Simplest Equivalent Logical Statement 

Example 4.4 What is the minimum number of prime (disjunctive) clauses whose 
conjunction is equivalent to the following statement? 



Xi V X2 V X4 


(4.70) 


X\ V X2 V X4 


(4.71) 


YiVY~4V X5 


(4.72) 


YiVY~4vY~5 


(4.73) 


^ V X 3 V A 5 


(4.74) 


^ V ^ V A 5 


(4.75) 


X2 V X3 V X4 


(4.76) 


Xi V 2^2 V A3 


(4.77) 


A 3 V A 4 V A 5 


(4.78) 


Ai V A 3 V A 5 


(4.79) 



Applying resolution and absorption successively delivers the following set of 
prime implications, giving the following equivalent statement: 



Ai V A 4 


(4.80) 


■X(v Aj^ 


(4.81) 


■^V A 5 


(4.82) 


■A 3 V A 5 


(4.83) 


Ai V A 2 V A 3 


(4.84) 


X2 V X 3 V X4 


(4.85) 



There may be a proper subset of these prime implications, the conjunction of 
which is also equivalent to the original statement. We give an IP formulation later. 
We can solve this example by means of a truth table with 32 rows. It is sufficient 
to find a subset of the clauses the truth of which implies the truth of all the clauses. 
(Implication the other way follows from the fact that the clauses are already prime 
implications.) Therefore we need to hnd a subset, the truth of which implies the 
truth of the other clauses. 

It can be shown that the hrst hve clauses imply the sixth. Also the first four and 
the sixth clause imply the hfth. No smaller subset implies the others. Therefore there 
are two shortest equivalent statements. They are 
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(Xi V X4) ■ (Xi V X 4 ) ■ (Z 2 V Xs) ■ (X 3 V X 5 ) ■ (Zi V X 2 V X 3 ) ( 4 . 86 ) 

(Xi V X 4 ) ■ (^ V ^) ■ (^ V Xs) ■ (X 3 V X 5 ) ■ (Z 2 V X 3 V ( 4 . 87 ) 

In practice heuristics are often used to find, hopefully good, but suboptimal solu- 
tions to this problem. However, in this book, we concentrate on exact methods. 
Figure 4.2 demonstrates the dependencies of the prime implications. 



X1VX4 



X1VX4 



t 




i 



X4 V X2 V X3 



Xj VX3 VX4 




X2VX5 



X3 VX5 



Fig. 4.2 Dependency between prime implications 



If a clause has arrows leading into it then it is the result of successive resolutions 
applied to the clauses from where the arrows originate and the resultant clauses. It 
is therefore redundant when all the clauses at the origins of the arrows are present. 
Notice, therefore, that Z1VX2VX3 is redundant if Z2VX3VX4 is present, being 
the resolvent of this clause and Xi V Z4. However, X2 V X3 V X 4 is the resolvent 
of Xi V X2 V X3 and X\ V X 4 . Therefore Xi V X2 V X3 and X2 V X3 V X 4 are 
each only redundant if the other is present. 

We can use the dependency diagram to give an IP formulation of the problem 
of finding the minimum sum of prime implications. We introduce 0-1 variables y,- 
which are 1 if the corresponding clause is present. For each clause we keep track 
of which sets (if any) of prime implication clauses give rise to it by successive 
resolutions applied to them. Then this clause is redundant if any of the origin sets of 
clauses is present. Each origin set of clauses {ii, *2, . . . , ir) is represented by a 0-1 
variable z,i,, 2 ,...,j,.which takes the value 1 if all the clauses in the set are present. 

Clause i may have a number of alternative origin sets , Sf, . . . , Sf . In order to 
avoid ignoring any origin set we must suspend the absorption operation when using 
resolution for this purpose. Clause i can only be removed if all the clauses in one of 
the origin sets is present. Therefore we stipulate 

Ji + ^ Zs> > 1 for all i (4.88) 

i 

If a clause has no origin sets then it must be present and there is no term. 

In order to ensure that if Zi,j 2 ,-Jr — 1 clauses ii, ( 2 , ■ . ■ ,ir are present we have 
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Zh.h.-,h - Jij < 0 for all ij (4.89) 

The objective 

Minimise y,- (4.90) 

i 

enables the minimum sum of prime implications to be found. 

For our (simple) example we have the model 



Minimise yi + y 2 + ys + M + Ts + Te (4.91) 

Ti. yi, T3, T4 ^ 1 (4-92) 

T5 + z\,6 > 1 (4.93) 

ye + 22,5 ^ 1 (4.94) 

21,6 - Ti < 0 (4.95) 

21,6 - T6 < 0 (4.96) 

22,5 - T2 < 0 (4.97) 

22,5 - T5 < 0 (4.98) 



We can simplify a statement in DNF to produce its prime impUcants using the 
(logical) dual of resolution. A prime implicant is a conjunctive clause which implies 
the original statement and is such that no smaller clause containing a subset of the 
literals with the same sign implies the statement. The corresponding operation to 
resolution is known as consensus. A consensus of two (conjunctive) clauses with 
overlapping literals of the same sign, except for one literal, is obtained by taking 
their conjunction and deleting the common literal which is negated in one and 
unnegated in the other. For example the consensus of 



Xi-X 2 - V Z2 • X3 • A4 



(4.99) 



is 



Ai-A2-Z4 (4.100) 

The consensus clearly implies the pair of clauses from which it is derived. It is 
added to the set of clauses. 

We also apply absorption. If one conjunctive clause contains literals (of the same 
sign) which are a subset of the other then the larger clause is redundant and is 
removed. If we successively apply consensus and absorption to a statement in DNF 
we obtain the disjunction of all the prime implicants. This statement is equivalent 
to the original statement. Note the distinction between simplification in DNF and 
CNF. In CNF we obtain prime implications but in DNF we obtain prime implicants. 
But analogous to the CNF case, the disjunction of a proper subset of these prime 
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implicants may also be equivalent to the original statement. In order to illustrate this 
let us consider the negation of the statement in Example 4.4. Using De Morgan’s 
laws (as described in Chapter 1) this is naturally written in DNF by interchanging 
and ‘v’ and changing the sign of each literal. Applying consensus and absorption 
delivers us the logical dual of (3.80)-(3.85), i.e. 



(Xi ■ X4) V (Zi ■ X4) V (X 2 ■ X 5 ) V (A 3 ■ As) V (Ai ■ A 2 ■ A 3 ) V (A 2 ■ A 3 ■ A 4 ) (4. 101) 

The minimum disjunction of prime implicants, giving an equivalent statement, 
consists of either the first hve or the first four and the sixth of the above conjunctive 
clauses corresponding to the duals of (3.86) and (3.87). 

We have confined our attention to minimising the number of clauses in an equiv- 
alent statement (whether in CNF or DNF). Alternatively we might wish to minimise 
the total number of literals in the equivalent statement. In order to do this we would 
weight clauses according to their size in the minimisation. For the example this still 
results in (4.86) and (4.87). 

It will clearly never be the case that the statement using the minimum number of 
literals does not use prime implications (or prime implicants in the DNF case). 

CNF and DNF are only two of the normal forms which we might wish to 
minimise. As discussed in Chapter 1 there are many ways to represent a logical 
statement as well as different connectives which can be used. In practical terms 
the relevant form to use depends on the problem being considered. For example 
in Sect. 4.10 we discuss different forms of logical circuit design where we might 
wish to minimise the number of switches or gates, or some other aspects of the 
architecture. 



4.7 Horn Clauses: Simple Satisfiability Problems 

The satisfiability problem and its extensions are very difficult problems to solve in 
the worst case (they are NP hard) whether they are solved by logical or IP methods. 

However there are special cases that are comparatively easy to solve. One such 
is when all the clauses are Horn clauses. These are statements which can be written 
in the form 



Ai ■ A2 ■ A3 



A4 



(4.102) 



or 



Ai-A 2 -A 3 — ^T (4.103) 

i.e. they can be written as implications where the premises are conjunctions of 
unnegated literals and the consequents are either single unnegated literals or tau- 
tologies (always true). Written as disjunctive clauses (4.102) and (4.103) are 
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X 1 VZ 2 VX 3 VX 4 (4.104) 

and 

(4.105) 

Hence Horn clauses, expressed as disjunctions, contain at most one unnegated 
literal. This makes such systems particularly easy to solve (the number of steps is 
always a polynomial function - which can be made a linear function - of the size of 
the problem). A number of logic programming systems restrict themselves to rules 
of this form for this reason. Some systems that are not Horn can be made such by 
a simple transformation of variables (e.g. replacing unnegated literals by negated 
ones, throughout the system, or vice versa). Such systems are called essentially 
Horn. 

We illustrate the situation by means of the following example. 

Example 4.5 From the following set of Horn clauses is it valid to deduce X 2 ? 



V A 2 V A 3 


(4.106) 


■ Zi 


(4.107) 


■ A 4 


(4.108) 




(4.109) 



As explained, at the beginning of this chapter, we negate the conclusion, append- 
ing 



■A 2 (4.110) 

as an extra Horn clause and test the satisfiability of the system. If it is not satisfi- 
able then the conclusion X 2 is valid. (Alternatively we could deduce all the prime 
implications of the original system to see if X 2 is one of them.) 

If we apply resolution to the system we obtain the tree in Fig. 4.3. 

Notice that one of the two clauses in each resolution operation is one of the input 
clauses. This is in contrast to Fig. 4. 1 where the last resolution was between two 
non-input clauses. If each resolution involves an input clause it is only necessary to 
store them and the current resolved clause. This leads to a reduction in the amount 
of computation to a polynomial number of steps. (In fact it can be made a linear 
number of steps.) In order to see why input resolution is sufficient for delivering all 
the prime implications of a Horn clause system we consider the following (totally 
general) example of two successive resolutions illustrated in Fig. 4.4. 

Cl, C 2 , C 3 , C 4 are disjunctive clauses and the input clauses are at the top level. 
In order for the system to be Horn and for the resolutions to be possible we stipulate 
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Xi V X2 V X3 Xi V X2 




X2VX3 X1VX3VX4 




Xi V X2 V X4 X2 V X4 




X1VX2 X1VX2 





□ 



Fig. 4.3 A resolution tree for a Horn system 



Cl VXi 



C2 V V 




Cl V C2 V 






Cl V C2 V C3 V C 4 



Fig. 4.4 Non-input resolution in a Horn system 



that all the literals in C\, C 2 , C3 and C4 are negated. (Some literals may be shared 
in common.) 

At the top level we eliminate X\ and X 2 , respectively, and at the next level we 
eliminate X 3 . 

Observe that Fig. 4.4 illustrates a non-input resolution since both clauses at the 
second level are non-input. But it is possible to derive the same result by the input 
resolution illustrated in Fig. 4.5. 

Therefore we can restrict ourselves to input resolution when we have Florn sys- 
tems. Another feature of Horn systems, worth remarking on, is that they are closed 
under resolution. Each resolved clause is itself Horn. 

It also turns out that the corresponding IP model for a Horn system is partic- 
ularly easy to solve. We illustrate the combining of constraints in such a model, 
corresponding to Example 4.5, in Fig. 4.6. 

In order to simplify the presentation we do not explicitly include the constraints 
Xi ^ 0 and — x, ^ — 1 which are added, where necessary, to make coefficients equal. 
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Cl V Ca V Xi V X3 




Cl V C2 V C3 V C4 

Fig. 4.5 Input resolution in a Horn system 

— Xi~ X 2 + X^>=—1 —Xi—X2>=0 




-2x2 + 2x3>=-1 2{-Xj-X3-X4^>=-2) 




- 2xi — 2x2 — 2x^>= —5 2{ — X2 + x^>=0) 




— 8 x 2 >= — 7 



Fig. 4.6 Input resolution for an equation system 



The final constraint, after rounding, clearly demonstrates that X 2 — 0. 

Figure 4.6 also shows that, with input resolution applied to constraints, we can 
postpone rounding until the end. Consider a constraint representing a Horn clause. 



n 

- +X„+i ^ I - n 

i=l 



(4.111) 
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(the variable x„+i may or may not be present). After an application of resolution we 
still obtain a Horn clause which, with rounding, would produce a clause of the form 
(4.1 1 1). Therefore before rounding it would take the form 



n 

— Xi + x„+i ^ 1 — n — e (4.112) 

1=1 



where 



0 < e < 1 

A subsequent resolution, involving (4.112), would also involve an input clause 
with an integer right-hand side. These clauses, when added together, and divided 
by a positive integer p (the left-hand side common coefficient) would result in a 
right-hand side of the form 



N-- (4.113) 

P 

The result, after rounding, must again be a Horn clause of the form (4.111). 
Hence postponing rounding until the end still allows us to deduce a constraint rep- 
resenting the final conclusion. 

It is therefore a feature of the IP formulation of Horn systems that they can be 
solved as LPs together with a final rounding to the optimal objective value. Since 
there exist polynomially bounded algorithms for LP this again demonstrates the 
computationally easier nature of Horn systems. The feasibility (satisfiability) of a 
system can therefore be tested by LP. A system will be satisfiable if and only if the 
LP relaxation of the corresponding IP model is feasible. 

Here we are effectively applying (the LP form of) Fourier-Motzkin elimination 
as explained in Chapter 2. 



4.8 Constraint Logic Programming 

Constraint logic programming (CLP) is, in many ways, better called constraint satis- 
faction and uses relatively little ‘logic’. It is an alternative approach to solving many 
of the problems discussed in this book. It is sometimes, but not always, a faster 
method. Although it is not based on the methods of LP and IP some of the oper- 
ations (e.g. branching) used are similar to those of IP. Since it does not rely on 
the rich computational methodology of LP it lacks the mathematical sophistication 
of LP and IP. However, it is much richer in its modelling capabilities and more 
flexible in its solution strategies. These aspects are discussed in subsequent parts of 
this section. One of the reasons it proves powerful for some problems is the rather 
mundane one that computers have become very fast. Even complete enumeration 
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has become viable for some problems. CLP has many features in common with the 
reduce (presolve) procedures now commonly used to preprocess models. 

CLP is not designed as an optimisation method although it can be adapted to such 
by making the objective, with progressively tighter bounds, a constraint. 

So far there has been comparatively little success in combining the two 
approaches into ‘hybrid’ systems although there have been a number of designs 
for such systems. Some of these are discussed subsequently. 

CLP can be regarded as a ‘procedural’ language as it is possible to model condi- 
tions, in the statement of the problem, which help to direct the computational search. 



4.8.1 Modelling in CLP 

LP and IP restrict the modelling to linear expressions of the form and ‘=’. 

In many situations it is convenient to model many more conditions. This is usually 
done in CLP in the form of predicates. Of course it is possible to convert these pred- 
icates into conventional IP constraints but the conversion is often cumbersome. We 
discuss some of the most widely used predicates below. These are usually referred to 
as global constraints as they are in-built into the computer software used. In contrast 
it is also usually possible to define local constraints specific to the problem being 
modelled. These constraints may serve the purpose of modelling specific conditions. 
They may also serve the purpose of directing or restricting the search for solutions 
in a way which will speed the computation. For example constraints that rule out 
symmetrically equivalent solutions are often highly effective. 

We describe some of the most commonly used predicates below (which may take 
different names in different software systems). There are many more which can be 
found in the manuals of the relevant software systems. In all cases we are assuming 
the variables are restricted to a finite number of discrete values. 



4.8.1. 1 The ^ Constraint 

This may take the form 



ajXj ^ b (4.114) 

J 

in contrast to LP and IP constraints. Alternatively it may take the form 



T ^ {Vi, V2, V„] 



(4.115) 



so preventing x taking one of the specified values. We could model (4.114) using 
IP (see Section 3.1) as 
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— (M — b + s)S < b — s (4.116) 

jj 

ajXj + (m — b — s)S ^ m (4.1 17) 

j 

where M and m are, respectively, known upper and lower bounds on ajXj and 
e is a ‘tolerance’ above or below which the quantity is not regarded as equal to b. 

We emphasise, however, that (4.114) is an alternative, less cumbersome, way of 
modelling the condition. 

Ways of modelling some of the other predicates below are left as exercises in 
Sect. 4.12. 



4.8.1.2 The ‘All-Different’ Constraint 

This condition frequently arises in practical problems and is conveniently written as 

alLdiff(x\,X2,---,Xn) (4.118) 

It means that all the x, must take different values. 

For example assignment conditions that arise as part of many problems (apart 
from the ‘pure’ form of the ‘easy’ classical assignment problem mentioned in 
Chapter 2) can be modelled by (4.118). The value of x, is interpreted as the name 
(number) of the position to which i is assigned and (4.1 18) prevents more than one i 
being assigned to any position. 



4.8.1.3 The Cardinality Constraint 

This condition can be written as 

cardm{x\, X2, ■ ■ ■ , Xn\v) (4.119) 

It means that exactly m of variables xi , X 2 , . . . , x„ must take the value v. 



4.8.1.4 The ‘At-Least’ Constraint 

This can be written as 



atJeastm{x\, X 2 , . . . ,x„\v) (4.120) 

It means that at least m of variables xi, X 2 , . . . , x„ take the value v. 

It was used as a modelling predicate in Sect. 3.5 in the form ‘greater-than-or- 
equaV (‘ge’) where the variables were Boolean and ‘v’ represented ‘True’. There 
it was allowed to take a nested form. 
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4.8.1.5 The ‘Circuit’ Constraint 

This can be written as 



circuit X 2 , ■■■, Xn) (4.121) 

where (xi, X 2 , ■ ■ ■ , x„) is a permutation of integers 1, 2, . . . , n. x, is the number 
of the integer in the circuit after i. For example circuit(5, 4, 1, 3, 2) represents the 
circuit shown in Fig. 4.7. 




Fig. 4.7 A circuit 

But it can easily be checked that the instantiated predicate circuit(6, 4, 7, 1, 3, 
8, 5, 2) is false, i.e. the given permutation of numbers makes the predicate false as 
it does not represent a circuit (these are the ‘subtours’ illustrated in Fig. 2.16). 

This predicate gives, in particular, a very compact formulation of the travelling 
salesman problem (TSP) (although the TSP is not efficiently solved by CLP). 



4.8.2 Solving CLP Models 

In the form of CLP which we are describing each variable in a model will have a 
finite domain of discrete values. These may be numbers or names, e.g. 

X e {2,5,6, -1,0} (4.122) 



or 



town e {Lewes, Edinburgh, Truro) (4.123) 

Constraints (possibly in the form of predicates) will restrict the values which sets 
of variables can together take, e.g. suppose 
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x\, X 2 , Xi e {I, 2, 3} 


(4.124) 


all-diff(xi, X 2 , X 2 ) 


(4.125) 


2x\ -h 3 x 2 -|- X 3 ^ 10 


(4.126) 


Since (4.126) applies we can restrict the domains of some 
subsets of {1,2,3}. 

Since 


of the variables to 


2xi ^ 10 — 3xi — X 3 ^ 10 — 5 = 5 


(4.127) 


we must have xi ^ 2 restricting its domain to ( 1 , 2 |. 
Similarly since 




3 x 2 ^ 10 — 2 xi — X 3 ^ 6 


(4.128) 


we have X 2 ^ 2 restricting its domain to ( 1 , 2 |. 

Therefore, since xi and X 2 empty the domain (1, 2|, X 3 must take the value 3. 
This implies 


2 xi + 3 x 2 ^ 7 


(4.129) 


Hence 




3 x 2 ^ 7 — 2xi ^ 5 


(4.130) 



Therefore X 2 ^ 1 forcing X 2 = 1 and xi = 2. 

The above type of argument is known as constraint propagation, i.e. using con- 
straints, in conjunction with others, to restrict the domains of variables. 

As mentioned earlier such an approach is very similar to the presolve proce- 
dure found in many traditional IP systems to ‘simplify’ models before solving them 
by more sophisticated LP/IP methods. In CLP (and IP) this can be extended to a 
dynamic presolve where, after setting certain variables to trial values, in a tree search 
one applies the approach to restrict the domains of subproblems down particular 
branches of the search tree. 

After applying constraint propagation to restrict domains one is generally forced 
to try setting variables to values within their domains using a branching procedure. 
Then the implications of different trial values will be found, often demonstrating the 
impossibility of certain solutions (empty domains for some variables). 



4.8.3 Hybrid CLP and IP systems 



While CLP offers flexibility in modelling and search strategies IP relies on the 
strength of the LP (and possibly other) relaxations. There have been a number of 
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proposals to combine the two approaches into ‘hybrid’ systems, most of which have 
not been fully realised. We outline possible ways of doing this. 

4.8.3.1 Modelling in CLP to Produce IP Constraints 

This approach uses the rich modelling capabilities of CLP to state the problem. 
The predicates (constraints) are then converted into traditional IP constraints. One 
powerful, universal, way of doing this using the ‘ge’ predicate was described in 
Sect. 3.5. 

4.5.3.2 Transference of Information Between Two Systems 

Here the idea is to solve a model using CLP and IP in parallel with exchange of 
information between the systems, e.g. CLP would provide restrictions on domains 
(akin to a dynamic presolve) while IP would provide increasingly tighter bounds on 
the objective by means of the LP relaxation (which could, itself, be used for further 
domain restriction). 

If a model has a natural decomposible structure CLP can be used to solve sub- 
problems whose solutions are incorporated into a master problem which is solved 
by IP. 

There has been some success with this approach in producing special purpose 
methods for particular types of application. For example column generation is a 
method of solving important scheduling problems (particularly crew scheduling in 
the airline industry). Here columns (‘rosters’) are generated in the course of opti- 
mising a large 0-1 set-covering IP. Rather than generate the, potentially astronomic 
number of, columns all at once they are usually added in the course of optimisation. 
Their generation may effectively be done using CLP. The cutting stock problem 
also yields a model amenable to column generation. Here patterns for cutting up 
rolls of material can be generated in the course of optimisation (they are knapsack 
problems) and solved using CLP. The patterns give rise to columns of the ‘master 
problem’ to be solved by LP or IP. Another application is to maximising the through- 
put in a telecommunications network. Here different messages are routed through a 
network using different frequencies. There is a limit to the capacity of any one path. 
Different combinations of frequencies can be incorporated, within a capacity limit, 
along any path. Possible combinations can be found by CLP. The overall (discrete) 
multicommodity network flow model is then solved by IP. 

4.5.3.3 Benders Decomposition 

This is a powerful method of solving (usually mixed IP) structured models. For 
mixed IPs the method fixes the integer variables at trial values then solves the resul- 
tant LP. The result provides a bound on the optimal objective value of the original 
model. This is then incorporated into an IP model for finding the best values of 
the integer variables as an extra constraint (a ‘Benders cut’). This IP is sometimes 
efficiently solved by CLP. Again the approach is usually problem specific. 
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4.9 Solving Integer Programmes as Satisfiability Problems 

Most practical IP problems give rise to MIPs and are most efficiently solved using 
standard IP algorithms based on the LP relaxation and branch-and-bound augmented 
by cutting planes. However, the use of logical methods is very valuable in the for- 
mulation of these models. Badly built models may be prohibitively difficult to solve 
(even if they are accurately built). Logic provides the rigour and systematisation 
necessary. 

PIP models arise less frequently although there are a number of classic applica- 
tions. Usually the variables are restricted to be 0-1 although, so long as they are 
bounded and restricted to small integers, they can be converted to 0-1 variables by 
the method described in Chapter 2. 

The 0-1 PIP models can always be converted into satisfiability problems, and 
sometimes be solved efficiently as such by, e.g. the Davis-Putnam-Loveland proce- 
dure. Even if they are not solved in this way their conversion into a logical form pro- 
duces cutting planes which may be incorporated into the conventional IP methods. 

In order to illustrate the method we consider a small 0-1 PIP example. 

Example 4.6 Convert the following model Into a satisfiability problem and solve it 
by the Davis-Putnam-Loveland (DPL) procedure: 

Maximise 2\x\ -|- 32x2 -P 45x3 + 15x4 + 19xs -f- 42xe 
-t- 37xg -p 50x9 + 47xio 

subject to 20xi -P 34x2 -p 42x3 + 1 1-«4 + 48xs -p 39xe 
-p 38xjo ^ 63 

19xi -p 35x2 + 23x3 + 23x4 + 39xs 4- 35xe 
-P 33xio < 72 

Xi, X2, . . . , Xio € {0, 1} 



We represent the setting of each 0-1 variable x,- to 1 or 0 by the truth or falsity of 
an atomic statement X, . Each of the constraints imposes restrictions on the possible, 
combined, values of the x, , which may be stated in compound statements involving 
the Xi. For example (4.132) demonstrates that both X2 and X3 cannot be 1 since 
the sum of their coefficients in (4.132) exceeds the right-hand side. This can be 
represented by the statement X2 V Z3. 

In order to, systematically, find all such clauses implied by (4.132) it is conve- 
nient to place all the variables in ascending order of their coefficients as follows: 

1 Ix4-p20xi-p23xg-p34x2-p38xi0-p39x6-p42x3-p46x9-p48x5-p53x7 ^ 63 (4.134) 



+ 15x7 

(4.131) 

-P 53x7 + 23xg + 46x9 

(4.132) 

-P 45x7 + 61xg -p 47x9 

(4.133) 



Working from the left we can see if 
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i. There are any single statements which must be false (i.e. whose coefficients 
exceed the right-hand side). In this example there are none. 

ii. There are any pairs which cannot simultaneously be true. This gives the follow- 
ing conjunction of disjunctions: 



X4 V X7 



■Xi 


V 


X9 


■Xi 


V 


^5 


■Xi 


V 


X7 


■Xg 


V 


X3 


■Xg 


V 


X9 


Xg 


V 


^5 


■Xg 


V 


X7 


■X 4 


V 


X9 


■X2 


V 


^10 


■X2 


V 


^6 


■X2 


V 


X3 


■X2 


V 


X9 


■X2 


V 


^5 


■X2 


V 


X7 



•XioVXe 
•Xio V X3 
•Xio V X9 
■Xio V Xs 
■Xio V X-i 
■Xs V X3 
■Xfi V X9 
•Xfi V X5 
■Xfi V Xv 
•X3 V X9 
■X3 V X5 
■X3 V X7 
■X9 V X5 
■X9 V X7 
■X5 V X7 



( 4 . 135 ) 



We could now proceed to enumerate all triples, the sum of whose coefficients 
exceeds the right-hand side, and give the corresponding logical statement and then 
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all subsets of cardinality 4 and so on. Of course it would be redundant to enumerate 
sets whose members contain all those already represented by a subset of the set. 
The resulting disjunction would be absorbed by the smaller disjunction. If we were 
to proceed in this manner we would eventually capture the full import of constraint 
(4.132) in a conjunction of disjunctive clauses. However, to avoid the potentially 
explosive build up in clauses we will not proceed beyond pairs of cardinality 2 at 
this stage. 

Carrying out a similar procedure for constraint (4.133) produces some of the 
disjunctions in (4.135) together with 



X 2 VZ 8 (4.136) 

■Xe V Xg 
•Xg V Xio 
■X9 V Xio 
■Xi vXg 
■X4 V Xg 



Again we will, temporarily, content ourselves with disjunctions of pairs of 
literals. 

In order to deal with the objective we place constraints on its value and formulate 
the resulting logical implication. 

In this example the maximum possible value of the objective function is 323 (the 
sum of the objective coefficients) and the minimum value is 0. An efficient way of 
proceeding is to do a ‘binary search’ and progressively dissect the range of values 
in which the optimal objective value might lie. Therefore we begin by seeing if half 
the maximum objective value is obtainable. This is 161 (with rounding). Therefore 
we impose the trial constraint 



2lxi32x2 + 45x3 + 15x4 + 19x5 + 42xe + 15x7 + 37xg + 50x9 + 47xio ^ 161 

(4.137) 



In terms of complemented variables (x, = 1 — x, ) this may be written as 



21xi + 32x2 + 45x3 + 15x4 + 19x5 + 42xe + 15x7 + 37xg + 50x9 + 47xio ^ 162 

(4.138) 

We treat this constraint, as with the others, and express it in a logical form. 

The smallest subsets of these (complemented) variables, which give rise to logi- 
cal conjunctions, are of cardinality 4 giving 
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X 1 VX 3 VZ 9 VX 10 (4.139) 
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2 f 3 


V 


Tfg 


V 


Z9 


V 


2fio 



Again we will not proceed, at this stage, with enumerating the larger clauses. 

We take the conjunction of (4.135), (4.136) and (4.139) and see if it is satisfiable. 
Since resolution (together with absorption) is a ‘complete refutation procedure’ it is 
possible to determine if a statement, such as this, is unsatisfiable without resorting 
to the branching in the DPL procedure. 

We resolve the fourth disjunction in (4.139), and its successive resolvents, with 
each of V A9, Z3 V X9, X2 V X9, Zm V X9 in turn to produce Z9. Similarly we 
resolve the 10th disjunction of (4.139) with each of X(, V Z3, Zio V Z3, Z3 V Z9, 
Zg V Z3 in turn to produce Z3. Resolving the fifth disjunction of (4.139) with 
Z2 V Zg, Ze V Zg, Zg V Z3, Zg V Z9 in turn produces Zg. Resolving the eighth 
disjunction of (4.139) with Z2VZ10, Z2VZg, Z2VZ6, Z2VZ3 in turn produces Z2. 
These four resolvents resolve with the second disjunction in (4.139) to produce an 
empty clause (a contradiction), showing the objective of 161 is unobtainable. 

Note that we have managed to show that a ‘relaxation’ (only using some of the 
implied disjunctive clauses) is unsatisfiable and that, therefore, the full logical repre- 
sentation is unsatisfiable, i.e. it was unnecessary to generate all the implied disjunc- 
tive clauses. We could have sufficed with even fewer clauses if we had a judicious 
way of choosing them. 

We therefore further dissect the possible objective values from 0 to 80 by impos- 
ing the constraint 

21xi + 32x2 + 45 J 3 -|- 15x4 + 19x5 + 42xg + 15x7 + 37xg + 50x9 + 47xio ^ 243 

(4.140) 

From this the smallest clause we can create is 

Z2 vZg vZg VZ3 vZio VZ9 (4.141) 

Appending this to (4.135) and (4.136) and applying the DLP procedure 
(Exercise 4.12.24) shows the compound statement to be satisfiable. A satisfying 
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set of truth values corresponds to the solution is 

xi = X 4 = xio =1, X 2 — — x$ — xe — xi — Xi — x<) — Q (4.142) 

However, since the logical statement is a relaxation of the full problem we must 
check if this solution satisfies the constraints (4.132), (4.133) and (4.140). It clearly 
breaks them all. Therefore we must add further disjunctions to rule out this solution. 
We begin by appending only 

Z 1 VZ 4 VX 10 (4.143) 

The DPL procedure again shows this to be satisfiable with a satisfying set of truth 
values 

xi = X3 = X4 = 1 , X2 = X5 = xg = X7 = xg = X9 = xio = 0 ( 4 . 144 ) 

But it can be seen that this solution breaks (4.134). We therefore append the 
disjunction 

X 1 VX 3 VX 4 (4.145) 

Applying resolution demonstrates that the full statement is not satishable, show- 
ing that an objective value of 80 is not attainable. 

We therefore further dissect the possible objective values to lie between 0 and 40 
by logically representing 

21xi + 32x2 + 45 x 3 + 15x4 + 19x5 + 42xe -t- 15x7 + 37xg + 50x9 + 47xio ^ 283 

(4.146) 

The smallest clauses implied by this constraint are 

Xi VX2 V A3 vXe VX7 V Ag VX9 vXio ( 4 . 147 ) 

Xi V A2 V A3 V A4 V Ag V Ag V A9 V Ajo 

A2 V A3 V A5 V Ag V A7 V Ag V A9 V Ajq 

A2 V A3 V A4 V A5 V Ag V Ag V A9 V Aio 

Appending (4.147) to (4.135) and (4.136) and applying the DPL method shows 
the resulting statement to be satisfiable with the two possible solutions 

xi = xg = 1, X 2 = X 3 = X 4 = X 5 = X 7 = xg = X 9 = xio = 0 (4.148) 

and 



Xl = Xlo = 1 , X2 = X3 = X4 = X5 = Xg = X7 = Xg = X9 = 0 



(4.149) 
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Both these solutions satisfy the constraints of the original IP model. 

The first solution results in an objective value of 63 and the second in an objective 
value of 68 showing solution (4. 149) to be optimal. 

If we had only produced one of the satisfying solutions (and not checked if there 
were others) we could have proceeded by dissecting the interval between the corre- 
sponding objective value and 80 progressively seeking, or showing the absence of 
solutions with larger objective values. 

The full execution of use of resolution and the DPT procedure in this example is 
left as Exercise 4.12.24. 

An alternative approach to examples such as this would be to use the LP relax- 
ation to obtain upper bounds on the optimal objective value at each stage. These 
bounds could then be used as the right-hand side on the (objective) constraint. Also 
IP constraints could then be produced as ‘cutting planes’ from some of the resul- 
tant logical clauses in the manner described in Sect. 4.3. The application of this 
procedure to Example 4.6 is left as Exercise 4.12.26. 



4.10 Applications 

In this section we describe some practical problems that are particularly suited to 
being viewed as satisfiability problems, or extensions of this problem. They are 
often most satisfactorily solved using the methods of IP. 



4.10.1 Electrical Circuit Design Using Switches 

Suppose we wish to allow electric current to pass through a set of wires only if 
certain combinations of switches are on. There will be many ways of achieving this. 
Normally we will seek an ‘economical’ solution. Possible objectives are as follows: 



i. Minimising the number of switches. 

ii. Minimising the wire length. 

iii. Minimising the number of junctions. 



Eor example. Pig. 4.8 illustrates a circuit, with redundancies, which represents 
the logical statement (in DNP): 



Zi • Z2 V Xi • Z2 V A2 • X3 



(4.150) 



Simplifying (4.150) (by consensus and absorption) we obtain 



Zi V A2 ■ X3 



(4.151) 
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Fig. 4.8 A redundant electrical circuit 



These two prime implicants represent the smallest statement in terms of both 
the number of clauses and number of literals. The result translates into the circuit 
in Fig. 4.9. 





^1 



Fig. 4.9 A simplified electrical circuit 

For more complicated circuits we would need to specify the exact objective and 
use the methods described in Section 4.5. 



4.10.2 Logical Net Design Using Gates 

Electronic components exist to perform different logic functions. In particular 
if one has components which perform one of the complete connectives, dis- 
cussed in Chapter 1, then one can construct any logic function out of them 
by connecting together a number of these components. For example a com- 
mon gate used performs the NOR function (the connective arrow ‘j,’)- It is 
shown in Fig. 4.10. 

This gate gives a signal Y when, and only when, neither input X\ nor input X 2 
is present, i.e. the logical function Xi ■ X 2 . By connecting together such gates one 
can perform any logical function. For example the logical statement 
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NOR 















V 



Fig. 4.10 A NOR gate 



(Xi V Zz) ■ (Xi V X2) 



(4.152) 



(known as ‘exclusive or’) is realised by the net in Fig. 4.1 1. 



^1 X2 




Fig. 4.11 The exclusive OR function 

The minimum DNF or CNF realisation of (4.152) is of little use in designing 
a structure with the minimum number of gates to perform the function. Another 
‘normal form’ is needed. However, the problem can be formulated as an IP and is 
set as Exercise 4.12.20. In fact Fig. 4.1 1 is the minimal representation. 

It is worth pointing out that gates, such as NOR, can be connected to each other to 
give control systems with a ‘memory’. For example Fig. 4.12 represents a ‘flip-flop’. 



X, 



NOR 



y, 






X, 



NOR 



K, 



Fig. 4.12 A flip-flop 
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If the Xi signal comes before X 2 then the first NOR gate is switched off. The 
only output will be Y 2 . On the other hand if X 2 precedes Xi the only output will be 
Yi The logic of optimally designing systems with memory is beyond the scope of 
this book but can be achieved by IP. 

Many other gates are manufactured in practice performing functions such as 
NAND, XOR (exclusive or), OR, etc. 

A particularly important class of gates is known as ‘threshold gates’. These have 
a number of inputs (n) which multiply the 0-1 input signals by given amounts 
■ ■ • , and only produce an output if the combined input reaches some desig- 
nated threshold level b or more. A threshold gate is illustrated in Fig. 4.13. 




Fig. 4.13 A threshold gate 

If the inputs are represented by 0-1 variables x\, X 2 , ■ ■ ■ , x„ then the gate pro- 
duces only an output if the knapsack constraint 

a\X\ + a2X2 -h • • • -+- a„Xn > b (4.153) 



is satisfied. 

It is possible to create all the logic functions by individual or combinations of 
threshold functions. Exercise 4.12.22 addresses this question but further discussion 
is beyond the scope of this book. 

All the design problems discussed here are usually considered under the heading 
of VLSI (very large-scale integrated circuit) design. 



4.10.3 The Logical Analysis of Data (LAD) 

A major application of computational logic is to finding ‘economical’ (or minimum 
size) logical formulae which can be used to predict a result from certain input data. 
Important areas are disease diagnosis, credit scoring, pattern recognition, data com- 
pression in order to transmit or store digital data and the design of safety systems. 
The data are converted into a finite set of true/false observations. If necessary con- 
tinuous measurements are converted into 0-1 statements, e.g. 
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Y = True if and only if a ^ y ^ b (4.154) 

We motivate the discussion by a medical example. 

Example 4. 7 Create a minimal logical formula to diagnose the occurrence of a dis- 
ease as a result of the following observations: 

‘T’ and ‘F’ denote positive and negative symptoms, respectively, and indi- 
cates irrelevance of the symptom for the particular test. 

In order to obtain a logical formula for predicting the disease we first express the 
presence of a symptom S, by a literal X, . 

Table 4.2 is represented in DNF as follows: 



Zi • X2 • X4 V Zi • X2 • X4 ■ Xs V • X2 • X3 • X4 V • X3 • X4 ( 4 . 155 ) 



Table 4.2 Disease Present 



Symptoms 


Si 


S 2 


S 3 


S 4 


S 5 


Test 1 


T 


T 


- 


T 


- 


Test 2 


F 


T 


- 


F 


T 


Test 3 


F 


T 


T 


T 


- 


Test 4 


- 


T 


T 


F 


- 



Applying consensus and absorption we obtain the complete disjunction of prime 
implicants as 



Xi ■ Z2 • Z4 V Xi ■ X2 • Z4 ■ X5 V X2 • Z3 ( 4 . 156 ) 

By means of a truth table or IP we can show that none of the prime implicants 
is redundant. Therefore (4.156) represents the minimum equivalent statement to 
Table 4.2 in DNF. However, we also need to ensure that Table 4.3 is consistent 
with Table 4.2. 



Table 4.3 Disease Not Present 

Symptoms Si S2 S3 S4 S5 

Test 5 T T T F - 

Test 6 T F T - F 

Test 7 F F - - - 



Table 4.3 is represented by 



■ Z2 • X3 • X4 V • A2 • X3 • Z5 V Xi • X2 ( 4 . 157 ) 

Since this statement needs to be false, if the patient has the disease, we negate it 
using De Morgan’s laws to give (in CNF) 



(Ai V X2 V X3 V A4) • (Xi V X2 V Z3 V Z5 ) • (Ai V X2) 



(4.158) 
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If the patient has the disease this statement must be trae to be consistent with 
(4.156). 

Equation (4.158) can be simplihed to 



(Zi V X 2 V X 3 V Z 4 ) • (X 2 V X 3 V X 5 ) • (Zi V X 2 ) (4.159) 

Expressed in DNE this is (after simplihcation and taking out redundant prime 
implicants) 



Xi-X 2 V X2 • X3 V X2 • X4 V Xi-X 2 - X 3 V Xi- X 2 - X 5 (4.160) 

We must now check that the conjunction of (4.160) with (4.156) is consistent 
(satishable). This is left as Exercise 4.12.23. 

Once it is shown to be satishable we can use (4.156) to dehne a new (smaller) set 
of tests for diagnosing the disease. This is given in Table 4.4. 



Table 4.4 New Tests for the Disease 

Symptoms Si S 2 S 3 S 4 S 5 

Test 1 T T - T - 

Test 2 F T - F T 

Test 3 T - T - - 



As mentioned in Sect. 4.6 the Ending of the minimum logical expression can 
be very difficult for larger examples and heuristics often have to be resorted to in 
order to End suboptimal solutions. However, the Ending of the minimum expression 
is highly desirable in many of these problems in view of the cost or utility of the 
resulting test regime (in the case of LAD). 



4.10.4 Chemical-processing networks 

In this situation it is sometimes desirable to represent and analyse the logical 
dependence of different processes before solving the associated problem by IP. We 
illustrate this by the very small example in Eig. 4.14. 




Fig. 4.14 A chemical-processing network 
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PI, P2 and P3 are chemical processes into and out of which are flows of material 
whose quantities are represented by the corresponding x, variables. The processes 
have maximum capacities of inputs of Ml , M2 and M3, respectively. Their outputs 
are their inputs multiplied by a 1 , 0:2 and «3, respectively. It is not possible to operate 
PI and P2 simultaneously. If we represent whether each Pt is operating or not by 
0-1 variables y, then we have the following constraints: 



xi — X2 — X3 = 0, X6 — X4 — X5 = 0 (4.161) 

X4 — «1X2 = 0, X5 — «2T3 = 0, X7 — «3X6 = 0 (4.162) 

X2 — M\y\ <0, X3 — Miyi <0, xe — M^y^ < 0 (4.163) 

yi + T2 < 1 (4.164) 



In addition we will have costs on xi, X3 and X5 and a revenue on x? as well as 
(fixed) costs on y\, y^ and ^3. 

As explained in Chapter 2 the standard version of the branch-and-bound algo- 
rithm relies on dichotemies based on branching on 0-1 variables. However in this 
type of application there is merit in analysing the logical relationships between 
the processes and representing them in a ‘simple’ form allowing more useful 
dichotemies. We do this using logical notation. Also it is worth representing, explic- 
itly, the obvious condition (which would be satisfied at optimality) that if process 
P3 is not used then it is not worth using PI or P2. The relationships between the 
processes can then be modelled as follows: 

(i^ V 1^) ■ ((Ti V T 2 ) ^ J's) (4.165) 

The 0-1 variables can be used to model the satisfaction of the two clauses. 
Branching on these variables before the other variables is advantageous. 



4.10.5 Other Applications 

Truss design involves designing a structure, at minimum cost, to support objects of 
a given weight and size. Some bars in the structure are supported by other bars. 
Hence there is a logical dependency of the former on the latter which can be formu- 
lated using logical statements. The objective function will be represented as a linear 
combination of the literals. 

Optimal pit limits involves finding the ‘best’ volume of earth and rock to extract 
from an open-cast mine in order to maximise profit. If the mine is ‘discretised’ by 
conceptually visualising the volume in discrete blocks there is a logical relationship 
between the blocks which should be excavated. In order to be able to extract the 
richer ore ‘uneconomic’ blocks above (overburden) must be extracted first. In total 
there will be an optimum combination of blocks which should be extracted to max- 
imise total profit. An extension of the problem is to find the optimal order within 




4. 1 1 References and Further Work 



141 



which to extract blocks in order to maximise discounted cash flow. Again one must 
observe the logical dependency between the blocks to be extracted. 

Designing safety precaution measures, e.g. fire prevention, nuclear safety or sen- 
sor placement, can be formulated as a logical problem. Certain measures (e.g. lock- 
ing or creating barriers and routine inspections) may or may not be implemented in 
conjunction with others. The detection of a potential hazardous event will depend 
on a logical combination of events. It may be desirable to express the logical rela- 
tionship in as economical a manner as possible. 

Database access languages (such as the widely used System Query Language 
(SQL)) allow the user to make a logical statement involving ‘OR’, ‘AND’ and 
‘NOT’ concerning the combination of tables to look up when retrieving information. 
In order to make the search of very large databases as fast as possible (and not make 
mistakes in the specification) it is desirable to represent the logical statement as 
compactly as possible using a normal form. Therefore logical manipulation by the 
user, or the system being used, is desirable. Data mining is an area where these 
considerations are very relevant. 



4.11 References and Further Work 

As discussed in Chapter 2 the satisfiability problem is the standard problem used for 
comparison in defining a measure of computational complexity since many prob- 
lems can be converted to it. The classic reference here is Garey and Johnson [41]. 
Pardalos et al. [88] edit a book devoted to the satisfiability problem. 

Hooker [54] treats IP in the framework of logical inference. The resolution prin- 
ciple is due to Robinson [95] (in the form of statements in CNF). However, Quine 
[93, 94] was the first to define the method. He applied it to statements in DNF. Davis 
and Putnam [31] created the method under their names which was later extended by 
Loveland [75]. Jeroslow and Wang [65] give one of the best heuristics for choice of 
branching literal. 

The representation of logical problems as IPs was presented early on by 
Williams [112]. 

Hooker [54, 53] discusses the similarities (but non-equivalence) of prime impli- 
cations and /ace A. 

Quine [95] discusses the problem of finding the minimum sum of prime impli- 
cants (the DNF form of finding the minimum sum of prime implications) but does 
not provide a viable method for other than very small problems. 

Truemper [105] discusses the computational side of solving some of the prob- 
lems in this chapter. 

Horn clause systems form the basis of a number of logical languages used by 
expert systems. One of the earliest is PROLOG (see Kowalski [70]). Dowling and 
Gallier [34] describe how to solve the satisfiability problem for Horn clauses in 
linear time. Gallo and Scutella [40] discuss ‘easy’ satisfiability problems. 
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Constraint logic programming is described in many books such as Tsang [106], 
Van Hentenryck [108] and Milano [83]. Hooker [57] discusses the relationship with 
IP and lists many of the global predicates used. Yan and Hooker [128] discuss the 
formulation of the cardinality constraint as an IP and Williams and Yan [120] discuss 
the alLdifferent constraint. Hybrid systems are discussed by Hooker [56]. 

Benders decomposition is due to Benders [14] and discussed in many textbooks 
such as Nemhauser and Wolsey [ 86 ] and Martin [80]. Geoffrion [42] describes a 
practical implementation of Benders decomposition. 

Granot and Hammer [47] describe the conversion of PIP models to satisfiabil- 
ity problems and Hooker [55] gives the method of appending clauses as needed. 
An example of a ‘difficult’ pure 0-1 model (based on the British Petroleum/Shell 
demerger over a number of products and sites) and the value of appending ‘cover’ 
constraints are described in Williams [113]. 

Problems of circuit design are discussed in, for example, Weste and Harris [109]. 
Threshold functions are described by Muroga [84]. 

The logical analysis of data and its applications are described by many authors 
such as Boros et al. [19] and Anthony and Hammer [2] to name only a few. 

Grossmann et al. [49] describe logical problems arising in chemical engineering 
and their solution aided by logical methods. Bollapragada et al. [17] describe the 
truss design problem. Williams [110], Jayawardane et al. [59] and Boland et al. 
[16] (among others) discuss the optimal pit limits problem. 

The Systems Query Language (SQL) is described in Emerson et al. [36]. 



4.12 Exercises 



4.12.1 Test if Example 4.1 is satisfiable using the DPL procedure. 

4.12.2 Show by the application of resolution and absorption that the following 
statement is not satisfiable: 

(Al V A 2 ) • (^ V ^ V ^) • (^ V A 3 ) • (^ V ^) • (^ V ^ V A 4 ) 



4.12.3 Show that the application of the DPL procedure to a satisfiable statement in 
CNF must result in a satisfiable set of truth values. 

4.12.4 Apply the DPL procedure to find a satisfiable set of truth values for 



(Al V A 2 V A 3 ) • (Al V A 3 V A 4 ) • (A 2 V A 3 ) • (A 3 V A 4 ) 



4.12.5 Convert the problem of determining the satisfiability of 



(Al V A 2 V A 3 ) • (Al V A 2 V A 4 ) ■ (Al V A 2 V A 3 ) • (Al v A 2 v A 4 ) 
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to an IP. Solve it by the branch-and-bound method (using the minimisation of the 
sum of variables as an arbitrary objective) starting with the LP relaxation. Use a 
suitable computer package if desired. 

4 . 12.6 Before solving the model in 4.12.5 by branch-and-bound apply the constraint 
form of resolution and absorption to simplify the model. Contrast the LP relaxation 
with that of 4.12.5. 

4 . 12.7 Show that, in general, resolution applied to disjunctive clauses is equivalent 
to the application of addition and rounding to the equivalent constraints (including, 
possibly, the 0-1 bounding constraints). 

4 . 12.8 Show that adding logical constraints, with more than one common variable 
differing in sign, and possibly rounding (if necessary adding 0-1 constraints) results 
in a valid but redundant extra constraint. 

4 . 12.9 Construct small 0-1 IP examples to show that facets do not exactly corre- 
spond to prime implications and vice versa. 

4 . 12.10 Show that, in general, resolution applied to the pigeon hole problem will 
result in an exponential number of steps (as a function of the number of variables). 

4 . 12.11 Show that input resolution only requires a linear number of steps (as a 
function of the number of variables). 

4 . 12.12 Express constraints (4.59)-(4.67) as a conjunction of disjunctive clauses. 

4 . 12.13 Create the minimum logical statement, in DNF, which represents the fol- 
lowing table of truth values: 

X2 Z3 X4 

T F T - 
- T F F 
F - T - 

4 . 12.14 Create the minimum logical statement, in CNF, which represents the nega- 
tion of that in 4.12.13. 

4 . 12.15 Represent all the prime implications of the statement in 4.12.14 as the 
nodes of a graph. Draw directed arcs from each pair of prime implications whose 
consensus is another prime implication. 

4 . 12.16 Model the condition x ^ {vi, V 2 , ■ ■ . , i>„) using IP constraints. 

4 . 12.17 Model the predicate alLdiff(xi, X 2 , ■ ■ ■ , x„] using IP constraints in two 
different ways. 

4 . 12.18 Represent the following statement: 

[(B ^ (C — > A)) ^(B^ A)] 
as an electrical circuit where each literal corresponds to a switch. 
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4 . 12.19 Represent the statement in 4.12.18 by means of a net of NOR gates. 

4 . 12.20 Formulate the problem of minimising the number of NOR gates to represent 
the ‘exclusive OR’ function as an IR Solve this model (using a suitable package 
program if desired). 

4 . 12.21 Design an electrical circuit that will only give a positive response when at 
least two out of hve possible votes (registered by switches) of a committee are cast. 

4 . 12.22 Formulate the general problem of minimising the number of threshold 
gates, each with three inputs, connected together to represent a logical statement 
in DNF as an IP. 

4 . 12.23 Check that statements (4.156) and (4.160) are consistent (satisfiable). 

4 . 12.24 Carry out all the steps of Example 4.6 using resolution and the DPL 
procedure. 

4 . 12.25 Program the method described in Example 4.6 for solving PIP models 
as satisfiability problems using all-integer arithmetic in a suitable programming 
language. 

4 . 12.26 Solve Example 4.6 using LP relaxations and ‘logical cuts’. Use a suitable 
LP package. 
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